Senior Software Engineer
Microsoft
Remote work
United States, Washington, Redmond
Jan 31, 2025
Overview

Microsoft is experiencing unprecedented cloud growth, and the Cloud Operations + Innovations organization (CO+I) builds and maintains the datacenters Microsoft uses worldwide. Automation is key to sustaining this growth, and our team specializes in using both conventional Artificial Intelligence/Machine Learning (AI/ML) techniques and Large Language Models (LLMs) to achieve a level of automation that would not be attainable with more traditional software engineering. These intelligent systems work across multiple specializations in the datacenter space. For example, they empower technicians in datacenters to service computers more rapidly, with greater accuracy and greater impact, using the collective intelligence of other technicians worldwide.

As a Senior Software Engineer, you will specialize in creating user interfaces, primarily using React and TypeScript, to build AI assistants and automated workflows. You will also have opportunities to work on the back end, where we use combinations of LLMs, Python, C#, NumPy, Pandas, TensorFlow, Kubernetes, Azure, web services, Linux, databases, vector databases, and other cloud technologies and libraries. You will collaborate with data scientists, customers, other engineering teams, and datacenter personnel to create integrated systems that accelerate the business and delight users. This is a flexible work opportunity offering up to 100% work from home.

Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond. In alignment with our Microsoft values, we are committed to cultivating an inclusive work environment for all employees to positively impact our culture every day.

Responsibilities

Applies debugging tools and examines logs, telemetry, and other methods to verify assumptions through writing and developing code, proactively before issues occur and reactively as issues occur for products. Conducts retrospective debugging of solutions to identify root causes of problems.

Leads by example within the team by producing extensible and maintainable code. Optimizes, debugs, refactors, and reuses code to improve performance, maintainability, effectiveness, and return on investment (ROI). Applies metrics to drive the quality and stability of code, as well as appropriate coding patterns and best practices.

Reviews the code of a product to assure it meets the team's and Microsoft's quality standards, is reliable and accurate, and is appropriate for the scale of the product/solution area. Applies feedback to current and future iterations. Participates in code reviews to ensure coding standards are followed. Considers diagnosability, reliability, and maintainability when reviewing code, and understands when code is ready to be shared or delivered. Applies and screens for coding patterns and best practices in reviews, and provides feedback on code to drive adherence to best practices.

Leads discussions for the architecture of products/solutions and creates proposals for architecture by testing design hypotheses and helping to refine code plans. Provides reactions, proposed solutions, and inputs to architects. Partially owns the architecture of solutions, with minimal technical oversight. Develops design documents for designs or User Stories, and determines the technology that will be leveraged and how it will interact. Shares learnings and identified solutions from investigations with the team and owns some design decisions.
Assures system architecture meets security and compliance requirements and expectations.

Independently creates a clear and articulated plan for testing and assuring quality of solutions, and defines success for outcomes of tests (e.g., unit tests). Identifies needs for a broad versus selected approach in testing mechanisms and makes informed decisions to implement the most effective tests. Drives efforts to add new tests, remove antiquated tests, and aggregate tests to improve the test suite. Improves recommendations to the team and augments test cases to ensure that solutions have good test coverage. Drives efforts to continually integrate automation features when planning for testing.

Drives identification of dependencies and the development of design documents for a product, application, service, or platform. Identifies other teams and technologies that will be leveraged, how they will interact, and when one's own system may provide support to others. Determines back-end dependencies associated with product, application, service, or platform functionality for the solution/product area. Understands upstream and downstream effects of solutions and work provided to ensure appropriate security and performance, drives reliability in the solutions, and optimizes dependency chains and retrieves across teams. Drives identification of areas of dependency and overlap with other teams or team members and drives coordination. Communicates across teams and resolves conflicts between teams.

Drives efforts to ensure the correct processes are followed to achieve a high degree of security, privacy, safety, and accessibility. Creates and assures the presence of visible evidence to demonstrate compliance for products. Develops and maintains a deep understanding of the implications of onboarding new technologies following expectations of compliance at Microsoft.

Considers and drives comprehensive application of automation within production and deployment of a product.
Runs code in simulated or other non-production environments to confirm functionality and error-free runtime for products. Defines and develops standardized, repeatable, scalable solutions to guarantee quality.

Maintains communication with key partners across the Microsoft ecosystem of engineers. Acts as a key contact for leadership to ensure alignment with partners' expectations. Considers partner teams across own organization and their end goals for products to drive and achieve desirable user experiences and to meet the dynamic needs of partners and customers through product development.

Applies and extrapolates best practices to reliably build code that is based on well-established methods while also applying best practices for new code development. Demonstrates and maintains an up-to-date understanding of both global and local regulations for technologies and system applications to ensure regulations are followed and met. Drives product development and scaling to customer requirements and applies best practices for meeting scaling needs and performance expectations.

Remains current in skills by investing time and effort into staying abreast of current developments. Proactively seeks new knowledge and adapts to new trends, technical solutions, and patterns that will improve the availability, reliability, efficiency, observability, and performance of products while also driving consistency in monitoring and operations at scale.

Builds, enhances, reuses, contributes to, and identifies new software developer tools to support other programs and applications to create, debug, and maintain code for products. Uses open source when possible. Begins to develop skills in other tools outside areas of expertise. Identifies internal tools and creates tools that will be useful for creating the product, determining if methods are still applicable for the current solution.
Shares best practices and teaches others about new tools and strategies.

Creates and updates the implementation framework as necessary, following industry standards. Drives implementation and deployment of the solution in the existing framework. Considers and accounts for the impact of build deployments on both users and other services. Assures that solutions are deployed safely.

Leverages subject-matter expertise of product features and partners with appropriate stakeholders (e.g., project managers) to drive a workgroup's project plans, release plans, and work items. Organizes work into smaller sets of tasks as part of an overall roadmap. Guides other members in project estimation and escalates any issues that would cause a delay.

Drives the creation and conduct of experiments to determine the effectiveness of changes, monitors developments for prototyping and testing products, and interprets results from experimentation.

Acts as a Designated Responsible Individual (DRI) and guides other engineers by developing and following the playbook, working on call to monitor system/product/service for degradation, downtime, or interruptions. Alerts stakeholders as to status and initiates actions to restore system/product/service for simple problems and complex problems when appropriate. Responds within the Service Level Agreement (SLA) timeframe. Drives efforts to reduce incident volume, looking globally at incidents and providing broad resolutions. Escalates issues to appropriate owners.

Drives efforts to collect, classify, and analyze data on a range of metrics (e.g., health of the system, where bugs might be occurring). Drives the refinement of products through data analytics, and makes informed decisions in engineering products through data integration.

Drives efforts to integrate instrumentation for gathering telemetry data on system behavior such as performance, reliability, availability, usage, and safety mechanisms.
Drives sustained feedback loops from telemetry that inform subsequent designs. Creates outputs of telemetry such as notifications or dashboards.

Maintains operations of live service as issues arise on a rotational, on-call basis. Implements solutions and mitigations to more complex issues impacting performance or functionality of Live Site service and escalates as necessary. Reviews and writes issue postmortems and shares insights with the team.

Collaborates with appropriate stakeholders (e.g., project manager, technical lead) to determine user requirements for a scenario. Leverages a variety of feedback channels to incorporate insights into future designs or solution fixes. Ensures appropriate continuous feedback loops measuring customer value, usage patterns, and other actionable metrics of value are incorporated.