by Sowmyah Narayanan in Information Technology
20th Aug, 2020
Kuala Lumpur, Federal Territory of Kuala Lumpur, Malaysia
Dynatrace Specialist - Malaysia
Our client is looking for a Dynatrace Specialist to join their team in Kuala Lumpur, Malaysia.
§ Responsible for the monitoring, analysis, troubleshooting and reporting of operational
performance. This includes, but is not limited to, Infrastructure, Application,
Network and Security.
§ Responsible for driving performance enhancements and leading targeted process improvement
§ Responsible for defining the metrics, data collection methods and reporting mechanisms, as
well as for the implementation of an overall performance management strategy.
§ Ensures the
effective capture of all logging and monitoring of all aspects of system and
application behaviour to facilitate fast detection and resolution of
Application availability issues.
§ SME in
troubleshooting all performance issues across the Enterprise. This role will
work closely with IT, Application Development, Project Management and external
vendors, ensuring the consistent tracking and reporting of metrics and
performance data across the Enterprise.
cost transparency efforts, and helping to develop mature cost metrics and Cost
§ Define and
maintain IT’s performance monitoring and reporting strategy (processes, tools
and templates); develop enhanced reporting capabilities through
standardization and automation
analyze trends in performance across IT; collaborate with process owners and
stakeholders to identify and implement process improvements to increase operational
efficiency and Application availability
§ Analyze and
recommend performance improvements for capacity, availability, performance,
support and security.
informed of production changes that could affect the functionality and
§ Ability to coordinate
across teams, working closely with peers to ensure the appropriate focus and
sense of urgency are applied to all issues
using logs, alerts and external data sources to determine network, application,
or security issues. The ability to correlate data to determine the root cause.
troubleshoots, reproduces, and documents issues and other pertinent information
in Incident or Problem tickets.
incident queue and performs various tasks as assigned and determines business
§ Handles ad
hoc requests and takes on new procedures as required.
§ When working
on projects, identify and track project issues and dependencies, ensure
follow-through, and ensure appropriate actions are taken to complete the project on
implement and manage cloud Automation using native Cloud tools.
§ A minimum of five years of experience in performance analysis and monitoring across multiple areas, including Infrastructure, Application, Network and Security, for medium- to large-scale companies.
§ Bachelor’s degree in computer science or information systems or an equivalent combination of education, work experience and/or applicable certifications.
§ Expert knowledge of IT performance metrics. Experience with data management, report design, data visualization and presentation techniques
§ Hands-on experience using open source and commercial tools such as LoadRunner/Performance Center, JMeter, Gatling and Locust, and APM tools like Dynatrace, AppDynamics, New Relic, Splunk, etc.
§ Ability to troubleshoot Application performance and monitoring issues and provide detailed analysis.
§ Ability to provide documentation that other Performance Operations Engineers can use.
§ Provide runbooks for other departments to execute.
§ Recommend ideas to streamline operations, improve operations, and create processes to proactively determine potential issues.
§ Experience with one or more Cloud platforms (Microsoft Azure, Amazon Web Services (AWS), Google Cloud or IBM Cloud) as it relates to performance, monitoring and cost management.
§ Expert experience with Application and Network Performance Management Tools
§ Knowledge and understanding of microservices and web application Protocols
§ Thorough understanding of throughput, latency, memory and CPU utilization
§ Knowledge of CI/CD technologies such as Jenkins, Ansible and Docker containers
§ Excellent communication, collaboration, reporting, analytical and problem-solving skills
§ Design/Implementation/Integration experience with Azure Monitor, New Relic, Splunk and infrastructure monitoring tools like Nagios
§ Scripting expertise in one or more languages such as Python, PowerShell or Perl
§ Integration experience with third-party Monitoring (Log/Event Triggers), Ticketing (Event/Workflow Triggers) and Orchestration/Automation (Event/Workflow Triggers) tools
§ Support solving complex performance issues, event correlation, resource optimization, tuning and/or triaging performance problems across on-premises and cloud environments
§ Collaborate and work with other senior staff to recommend and design systems architecture and topology from both general and specific perspectives.
§ Interact with IT Operations teams to communicate and understand the monitoring requirements, and provide support on an on-call rotation model.