
Senior Site Reliability Engineer

Full time

WHAT IS THE OPPORTUNITY?

Perform software development: refine requirements and write, test, debug, maintain, and release software that instructs a computer to accomplish certain tasks, such as saving information or performing calculations. The Software Engineer may also be responsible for managing software development infrastructure and providing the appropriate level of functional and non-functional documentation to meet the product and engineering requirements. As necessary, this position may be called upon to assist in performing on-site client work related to consulting projects or to provide technical support as required.



Exactuals Division
This is an exciting opportunity to be part of a small "start up" group within a larger organization. Exactuals colleagues work hard to support one another in the mission to modernize the payment space, primarily for the entertainment industry, while enjoying the journey together. Exactuals became an RBC / City National company in 2018.



WHAT WILL YOU DO?

  • Improve service reliability through root cause analysis, blameless postmortems, and using code to prevent or respond to problem recurrence.
  • Be the central point of contact on the health of systems to increase product reliability and organizational efficiency.
  • Be a part of an on-call rotation for responding to customer-facing emergencies.
  • Continuously improve production incident response and triage practices.
  • Document the details of each incident, including root cause, resolution, and solution.
  • Participate in planning, standup, and retrospective meetings.
  • Respond promptly to incidents and customer inquiries.
  • Collaborate effectively with internal and external customers.
  • Participate in reviews and operational readiness for services and infrastructure.
  • Maintain deep technical and business knowledge of Exactuals’ system architectures, ensuring continuous upgrade and integration of new capabilities.
  • Perform deep dives into both systemic and latent reliability issues
  • Perform service failure analysis across the entire application stack: front-end, back-end, cloud services, databases.
  • Promote modern and standard methodology for everything from monitoring to troubleshooting complex code issues.
  • Support release activities and production environment testing.
  • Design mechanisms for alerts and responses to identify and address reliability risks.
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, planning, and reviews
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Design and run performance, capacity, and monitoring tests.
  • Create educational material such as cloud native sample apps and starter code, and help run cloud native educational events like hackathons and live coding sessions. Write educational documentation on how-tos and best practices, and blog about use cases and architectures that relate to cloud platforms.
  • Liaise with the team managing our public cloud environments, including setup, management, and troubleshooting.



WHAT DO YOU NEED TO SUCCEED?


Must-Have*

  • Bachelor's degree in computer science.
  • Minimum 7 years’ experience working within software product development.
  • Minimum 2 years’ experience working in a software design or architect capacity.
  • Minimum 3 years’ experience working with cloud software deployment.
  • Minimum 3 years’ experience working in DevOps environments.


Skills and Knowledge

  • Strong communication skills, ownership, and drive.
  • Strong passion for diagnosing software and systems in a distributed, internet-scale Linux environment.
  • Significant expertise in the enterprise Java platform and modern technologies like Spring Boot, Node.js, and open source technologies.
  • Ability to dive deep into existing infrastructure and software to assist in solving problems.
  • Solid understanding of systems and application design, including the operational trade-offs of various designs.
  • Development or support experience in these areas: application and data security, batch processing, finance platforms.
  • Strong cloud experience.
  • Practical knowledge of various aspects of service design, such as messaging protocols and behavior, caching strategies, and software design practices.
  • Demonstrable knowledge of TCP/IP, HTTP, web application security, and experience supporting multi-tier web application architectures.
  • Ability to prioritize tasks and work independently.
  • Adaptability and the ability to focus on the simplest, most efficient, and most reliable solutions.
  • BS/MS degree in Computer Science, Engineering, or equivalent experience
  • Java 8+, JavaScript, Linux proficiency
  • Spring Boot, Node.js, React or Angular
  • MySQL, Postgres
  • Dynatrace, Graylog, or equivalent
  • Data Processing and ETL experience
  • Full SDLC experience in an enterprise development environment
  • Strong troubleshooting capabilities
  • Test automation and continuous integration
  • AWS technology experience is a must

*To be considered for this position, you must meet at least these basic qualifications.
The preceding job description has been designed to indicate the general nature and level of work performed by employees within this classification. It is not designed to contain or be interpreted as a comprehensive inventory of all duties, responsibilities, and qualifications required of employees assigned to this job.
