Tech Operations Manager
Are you an experienced Technology Operations Manager with a strong background in managing large-scale L1/L2 operations?
Job Description

Summary of Position
We are seeking a skilled and dedicated Technical Operations Specialist to join our Information Technology team. In this role, you will be responsible for ensuring the smooth operation, performance, and security of our IT infrastructure. You will manage and optimize systems, troubleshoot technical issues, and collaborate with cross-functional teams to support our business operations.

DUTIES AND RESPONSIBILITIES
The following duties and responsibilities reflect the definition of essential functions for this position, but do not restrict the actual tasks that may be assigned or acquired. Executive Management may assign or reassign duties and responsibilities to this position at any time due to reasonable accommodations based on business needs.

Infrastructure Management:
Design and Implementation: Develop and implement scalable, secure, and robust IT infrastructure solutions to meet the evolving needs of the business. Evaluate and integrate new technologies to enhance system performance and reliability.
Maintenance: Perform regular system maintenance, updates, and patches on servers, networks, and databases. Ensure hardware and software assets are inventoried and managed effectively.
Capacity Planning: Monitor system capacity and performance to anticipate future infrastructure needs. Propose and implement solutions to optimize resource utilization.

System Monitoring:
Monitoring Tools Management: Configure and manage monitoring tools (e.g., Datadog, New Relic, AppDynamics) to oversee system health and performance metrics.
Proactive Issue Detection: Set up alerts and thresholds to identify potential issues before they impact operations. Analyze system logs and monitoring data to detect anomalies or patterns indicative of underlying problems.

Incident Response:
First Responder: Act as a point of contact for technical incidents and outages.
Troubleshooting: Diagnose and resolve hardware, software, and network issues in a timely manner. Use problem-solving methodologies to identify root causes.
Post-Incident Analysis: Conduct thorough post-mortems to document incidents and implement strategies to prevent recurrence.
Communication: Keep stakeholders informed during incidents through timely and clear communication.

Deployment and Automation:
Software Deployment: Coordinate and manage the deployment of new applications and updates to production environments. Ensure deployments are carried out with minimal disruption to services.
Automation Scripts: Develop and maintain scripts in Python, Bash, or PowerShell to automate repetitive tasks. Implement Infrastructure as Code (IaC) practices using tools like Terraform or Ansible.
Process Improvement: Identify opportunities to streamline deployment processes and reduce manual intervention.

Security and Compliance:
Policy Implementation: Enforce security policies and procedures to protect company data and infrastructure.
Compliance Management: Ensure systems and processes comply with industry regulations and standards such as ISO 27001, NIST, and other relevant standards.
Vulnerability Management: Conduct regular security assessments, vulnerability scans, and penetration testing. Implement security patches and updates promptly.
Access Control: Manage user access rights and permissions to systems and data. Monitor for unauthorized access attempts and respond accordingly.

Collaboration:
Cross-Functional Work: Collaborate with development teams to optimize application performance and scalability. Work with product teams to understand upcoming changes and prepare the infrastructure accordingly.
Support: Provide technical support to customer service teams for escalated issues. Assist in the training of staff on new technologies or processes.

Documentation:
Technical Documentation: Create and maintain detailed documentation of system configurations, procedures, and processes.
Knowledge Sharing: Develop and update knowledge base articles for common issues and resolutions.
Reporting: Generate regular reports on system performance, incidents, and operational metrics.

Continuous Improvement:
Process Evaluation: Regularly review and assess operational processes for efficiency and effectiveness.
Best Practices Implementation: Stay informed about industry best practices and emerging technologies. Recommend and implement improvements to systems and processes.
Innovation: Propose innovative solutions to enhance service delivery and operational excellence.

Disaster Recovery and Business Continuity:
Planning: Develop and maintain disaster recovery and business continuity plans.
Testing: Conduct regular drills and tests to ensure plans are effective and up to date.
Recovery Execution: Work with teams on recovery efforts in the event of a system failure or disaster.

Vendor Management:
Coordination: Work with external vendors and service providers to support infrastructure needs.
Evaluation: Assess vendor performance and negotiate contracts and service-level agreements.

QUALIFICATIONS (required):
Bachelor’s degree in Computer Science, Information Systems, or a related field. Master’s degree a plus.
3-5 years of experience in a technical operations or IT infrastructure role.
Proven work experience in managing complex IT environments, preferably in the financial technology sector.
In-depth knowledge of and experience with scripting for automation (Python, Bash, PowerShell) and with automation and configuration management tools (Ansible, Puppet, Chef).
Strong understanding of network protocols, firewalls, VPNs, and load balancers.
Proficiency in performing database backup, recovery, and optimization.
Experience with cloud services (AWS, Azure, GCP), including compute, storage, networking, and security components.
Familiarity with relational and non-relational databases (MySQL, PostgreSQL, MongoDB).
Familiarity with monitoring and logging tools (Prometheus, Grafana, ELK Stack, Splunk).
Expert-level experience in Linux and Windows server administration.
Knowledge of virtualization technologies (VMware, Hyper-V).
Strong problem-solving, analytical, and troubleshooting skills.
Excellent communication and collaboration abilities, capable of conveying complex technical information clearly.
Relevant certifications (e.g., AWS Certified Solutions Architect, Microsoft Certified: Azure Administrator, CompTIA Security+, or equivalent) are a plus.

Physical Requirements:
Ability to be on call 24/7.
Ability to commute to the corporate office.
Multi-limb and eye-hand coordination.
Able to stand, bend, reach, stoop, and lift boxes up to 30 lbs.
Able to sit at a desk working on a computer for a full workday.
Able to work in a fast-paced environment, multitasking with organization and efficiency.