Recently, I mentioned how I refactored the script that keeps my GitHub profile up to date. Since GeeCon Prague, I'm also a happy owner of a Raspberry Pi. Though the current setup works flawlessly (and is free), I wanted to experiment with self-hosted runners. Here are my findings.

Context

GitHub offers a generous free tier for GitHub Actions:

> GitHub Actions usage is free for standard GitHub-hosted runners in public repositories, and for self-hosted runners. For private repositories, each GitHub account receives a certain amount of free minutes and storage for use with GitHub-hosted runners, depending on the account's plan. Any usage beyond the included amounts is controlled by spending limits.
>
> — About billing for GitHub Actions

Yet, the policy can easily change tomorrow. Free-tier policies show a regular trend of shrinking when:

- A large enough share of users is locked into the product
- Shareholders want more revenue
- A new finance manager decides to cut costs
- The global economy shrinks
- A combination of the above

Forewarned is forearmed. I like to try options before I need to choose one. Case in point: what if I need to migrate?

The Theory

GitHub Actions comprises two components:

- The GitHub Actions infrastructure itself, which hosts the job scheduler
- Runners, which run the jobs

By default, jobs run on GitHub's runners. However, it's possible to configure a job to run on other runners, whether on-premises or in the cloud: these are called self-hosted runners. The documentation on how to create self-hosted runners gives all the necessary information to build one, so I won't paraphrase it.

I noticed two non-trivial issues, though. First, if you have jobs in different repositories, you need to set up a runner for each repository. Runner groups are only available for organization repositories; since most of my repos live under my regular account, I can't use groups. Hence, you must duplicate the runner package for each repository on the Pi.
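Registering the runner per repository follows the steps from GitHub's documentation; the sketch below mirrors them, with the release version, owner, repo, and registration token as placeholders rather than real values:

```
# Download and unpack the runner archive (version is a placeholder)
mkdir actions-runner && cd actions-runner
curl -o actions-runner-linux-arm64.tar.gz -L \
  https://github.com/actions/runner/releases/download/v<version>/actions-runner-linux-arm64-<version>.tar.gz
tar xzf actions-runner-linux-arm64.tar.gz

# Register against one specific repository; the token comes from that
# repository's Settings > Actions > Runners page
./config.sh --url https://github.com/<owner>/<repo> --token <registration-token>
```

Because runner groups are unavailable for personal accounts, this whole sequence has to be repeated once per repository.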
In addition, there's no dedicated package: you must untar an archive. This means there's no easy way to upgrade the runner version. That being said, I expected the migration to be one line long:

```yaml
jobs:
  update:
    #runs-on: ubuntu-latest
    runs-on: self-hosted
```

It's a bit more involved, though. Let's detail the steps I had to undertake in my repo to make the job work.

The Practice

GitHub Actions depends on Docker being installed on the runner. Because of this, I thought jobs ran in a dedicated image: that's plain wrong. Whatever you script in your job happens on the running system. Case in point, the initial script installed Python and Poetry:

```yaml
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - name: Set up Python 3.x
        uses: actions/setup-python@v5
        with:
          python-version: 3.12
      - name: Set up Poetry
        uses: abatilo/actions-poetry@v2
        with:
          poetry-version: 1.7.1
```

In the context of a temporary container created during each run, it makes sense; in the context of a stable, long-running system, it doesn't. Raspbian, the Raspberry Pi's default operating system, already has Python 3.11 installed, so I downgraded the version configured in Poetry. It's no big deal because I don't use any Python 3.12-specific feature.

```toml
[tool.poetry.dependencies]
python = "^3.11"
```

Raspbian forbids the installation of any Python dependency in the primary environment, which is a very sane default. To install Poetry, I used the regular APT package manager:

```shell
sudo apt-get install python-poetry
```

The next step was to handle secrets. On GitHub, you set the secrets in the GUI and reference them in your scripts via environment variables:

```yaml
jobs:
  update:
    runs-on: ubuntu-latest
    steps:
      - name: Update README
        run: poetry run python src/main.py --live
        env:
          BLOG_REPO_TOKEN: ${{ secrets.BLOG_REPO_TOKEN }}
          YOUTUBE_API_KEY: ${{ secrets.YOUTUBE_API_KEY }}
```

This approach segregates individual steps so that each step has access only to the environment variables it needs.
For self-hosted runners, you set environment variables in a .env file inside the runner's folder instead:

```yaml
jobs:
  update:
    runs-on: self-hosted
    steps:
      - name: Update README
        run: poetry run python src/main.py --live
```

If you want a more secure setup, you're on your own.

Finally, the architecture is a pull-based model: the runner constantly checks whether a job is scheduled. To make the runner a service, we use the out-of-the-box scripts inside the runner folder:

```shell
sudo ./svc.sh install
sudo ./svc.sh start
```

The script uses systemd underneath.

Conclusion

Migrating from a GitHub-hosted runner to a self-hosted runner is not a big deal, but it requires changing some bits and pieces. Most importantly, you need to understand that the script runs on the machine itself. This means you need to automate the provisioning of a new machine in case of crashes. I'm considering the benefits of running the runner inside a container on the Pi to roll back to my previous steps. I'd be happy to hear if you found and used such a solution. In any case, I'm not migrating any more jobs to self-hosted for now.

To Go Further

- About billing for GitHub Actions
- About self-hosted runners
- Configuring the self-hosted runner application as a service
Fargate is a serverless compute engine for containers that works with both Amazon ECS and Amazon EKS. With AWS Fargate, we can run applications without managing servers (official information page). In this post, we will take a step-by-step approach to deploying and running a .NET Core Web API application on AWS Fargate.

Typical Use Cases for Fargate

Fargate supports all of the common container use cases, including microservices architecture applications, batch processing, machine learning applications, etc.

Application

For the application, I'll be using a .NET Core Web API application. But if you have a Java application or a server application written in another programming language, most of the deployment information will still apply. The following picture shows a .NET Core Web API application in Visual Studio. Once the project is created, I add a token controller. The token controller has one simple HTTP GET method. Now, we can run the application from Visual Studio. The Swagger UI shows up, and we can see the token endpoint and test it. This is a basic web API with very limited functionality, but that's OK for our demo purposes.

Docker Support

Next, I've added a Dockerfile to the solution, which we can use to run the application locally inside a container and also to publish an image to Docker Hub. The following picture shows the application running in a container on my local machine. After running it locally and testing it, we can push the image to Docker Hub or AWS Elastic Container Registry (ECR) so that we can use it in AWS Fargate to run containers from it. The application source code is available on this Git repository. I will be using Docker Hub, but feel free to select ECR, as per your requirements. The following picture shows that the image is available on Docker Hub. At this point, we have a .NET Core Web API application, packaged as a Docker image and available on Docker Hub.
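The Dockerfile itself only appears in the screenshots; as a rough sketch (the project name `TokenGenWebApi`, the .NET version, and port 5000 are assumptions made to match the demo, not taken from the article), a typical multi-stage build could look like this:

```dockerfile
# Build stage: restore and publish the Web API (project name is an assumption)
FROM mcr.microsoft.com/dotnet/sdk:6.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app/publish

# Runtime stage: serve the published output on port 5000
FROM mcr.microsoft.com/dotnet/aspnet:6.0
WORKDIR /app
COPY --from=build /app/publish .
ENV ASPNETCORE_URLS=http://+:5000
EXPOSE 5000
ENTRYPOINT ["dotnet", "TokenGenWebApi.dll"]
```

The multi-stage layout keeps the SDK out of the final image, so only the ASP.NET runtime ships to Fargate.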
Next, let's use AWS Fargate to run the application from this image. As mentioned earlier, Fargate is a serverless compute engine for containers.

AWS Fargate Terms

Let's familiarize ourselves with a few terms surrounding AWS Fargate and container services:

- An Amazon ECS cluster is a logical grouping of tasks or services. We can use clusters to isolate our applications so they don't use the same underlying infrastructure. When our tasks run on Fargate, the service also manages our cluster resources.
- To deploy applications on Amazon ECS, our application components must be configured to run in containers. A task definition is a text file in JSON format that describes one or more containers that form the application. We can use it to describe up to a maximum of 10 containers.
- A task is the instantiation of a task definition in a cluster. After we create a task definition for our application in Amazon ECS, we can specify the number of tasks to run on our cluster. We can run a standalone task or run a task as part of a service.
- We can use an Amazon ECS service to run and maintain the desired number of tasks simultaneously in an Amazon ECS cluster.

Create an Amazon ECS Cluster That Uses Fargate

Log in to the AWS web console, open the Amazon ECS dashboard, and click the Create Cluster button. It will open a form similar to the following. The following picture shows the entered cluster name and AWS Fargate as the selected infrastructure. Click the Create Cluster button at the bottom of the form, and we'll have a cluster provisioned for our workloads. That's all that was needed to set up a cluster.

Deploy a Container Using Fargate

Let's start by creating a new task definition on the ECS web console as shown below.
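The console form ultimately produces a JSON task definition. A minimal Fargate-compatible sketch for the demo image might look like the following (the family name, image path, and CPU/memory sizes are illustrative, not taken from the article):

```json
{
  "family": "tokengen-task",
  "requiresCompatibilities": ["FARGATE"],
  "networkMode": "awsvpc",
  "cpu": "256",
  "memory": "512",
  "containerDefinitions": [
    {
      "name": "tokengen",
      "image": "docker.io/<your-account>/tokengenwebapi:latest",
      "portMappings": [{ "containerPort": 5000, "protocol": "tcp" }],
      "essential": true
    }
  ]
}
```

Fargate requires the `awsvpc` network mode and task-level CPU/memory values, which is why those fields appear even in a minimal definition.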
Provide the following information. For the infrastructure section, I kept the defaults. For the container section, I input some details, e.g., the container name (tokengen) and the image URI, which is the location of the image in a registry (here, it points to the image on Docker Hub). We can also specify the port, which in this case is 5000. Keep the other defaults and click Create. That should result in the task definition being created.

So, a task definition is created, but no containers are running yet. We can now create a service. Select the definition, and choose Create service from the Deploy button as shown below. The following is a screenshot of Create service. Here, I set the desired task count to 3 (meaning it will run 3 containers of tokengenwebapi). To distribute traffic among the three instances, we can set up a load balancer, along with a target group the load balancer uses to route traffic. For the health check, we can specify an endpoint as shown below. The networking section allows us to select the VPC, subnets, and security groups. Click Create when you have reviewed the choices.

The following picture shows the service status. Soon, the UI is updated to reflect the task status, and the following picture shows that all 3 tasks are running. We can check the logs from Fargate. The following picture shows the network configuration. The load balancer is set up with a DNS name, and we can use it to access our application running in a container. Make sure that the security group allows inbound traffic if you want to allow public access to the application. The following picture shows the token endpoint accessed via the browser. If I refresh the page, we can see the issuer info and that all three instances are responding to incoming HTTP requests.

CLI Commands

If you have the AWS CLI set up, we can use CLI commands to check and update the Fargate infrastructure.
Retrieve Cluster Information

```shell
aws ecs describe-clusters --cluster DevCluster
```

Scale Out the Amazon ECS Service

When we created the Amazon ECS service, it included three Amazon ECS task replicas. We can see this by using the describe-services command, which returns three. Use the update-service command to scale the service to five tasks, then re-run the describe-services command to see the updated count of five.

```shell
aws ecs describe-services --cluster DevCluster --services tokengensrv --query 'services[0].desiredCount'
aws ecs update-service --cluster DevCluster --service tokengensrv --desired-count 5
```

Rolling Update

During my testing, I updated the image on Docker Hub and wanted to redeploy the application with the new updates. One way to do that is to create a revision of the task definition and then update the service. As you can see, all three containers were running, and then a new revision activity started. A few minutes later, we could see that all three tasks were updated and running. This all happened seamlessly without any effort on our part, and all that time the web API application remained available. The following picture shows the application accessed using the load balancer DNS address.

With this, we will end this post. The application source code is available on the previously linked Git repository.

Summary

Fargate is a serverless compute engine for containers that works with both Amazon Elastic Container Service and Amazon Elastic Kubernetes Service. In this post, we covered how, with AWS Fargate, we can run applications without managing servers. We saw a step-by-step demo of deploying and running a .NET Core Web API application on Fargate in a highly available environment. All it needed was a Dockerfile for our application, and then Fargate was able to pull the image and run the application from it. Let me know if you have any questions or comments. Till next time, happy coding.
In the rapidly evolving landscape of Kubernetes, security remains at the forefront of concerns for developers and architects alike. Kubernetes 1.25 brings significant changes, especially in how we approach pod security, an area critical to the secure deployment of applications. This article dives deep into the intricacies of Pod Security Admission (PSA), the successor to Pod Security Policies (PSP), providing insights and practical guidance to harness its potential effectively.

Understanding Pod Security Admission

With the removal of Pod Security Policies, Kubernetes 1.25 emphasizes Pod Security Admission (PSA), a built-in admission controller designed to enforce pod security standards at creation and modification time. PSA introduces a more streamlined, understandable, and manageable approach to securing pods, pivotal for protecting cluster resources and data.

PSA Basics

PSA operates on the principle of predefined security levels: privileged, baseline, and restricted. These levels provide a clear framework for securing your pods based on the security posture you need:

- Privileged: This level is essentially unrestricted and should be used sparingly, as it exposes pods to significant security risks.
- Baseline: A moderate level that protects against known privilege escalations while maintaining broad compatibility with existing applications.
- Restricted: This level applies a rigorous set of security standards, minimizing the attack surface and enforcing best practices.

Implementing Pod Security Admission

To utilize PSA effectively, it's essential to understand its configuration and implementation process. Let's walk through the steps to enforce pod security standards within a Kubernetes cluster.

Step 1: Enable Pod Security Admission

Ensure your Kubernetes cluster is running version 1.25 or later.
PSA is a built-in admission controller and is enabled by default in Kubernetes 1.25+; to verify its activation, confirm that the PodSecurity plugin has not been turned off via the API server's --disable-admission-plugins flag.

Step 2: Define Namespace Labels

PSA uses namespace labels to determine the security level for pods within that namespace. Define your desired level by labeling each namespace:

```shell
kubectl label ns <namespace> pod-security.kubernetes.io/enforce=baseline
```

This example sets the security level to baseline for the specified namespace.

Step 3: Configure the Pod Security Standards

Configuration at the namespace level allows for flexibility and granularity in security enforcement. For instance, to apply the restricted level, you would update the namespace label as follows:

```shell
kubectl label ns <namespace> pod-security.kubernetes.io/enforce=restricted
```

Practical Example: Deploying a Secure Pod

Let's illustrate how to deploy a pod that complies with the restricted security level. This example assumes you've already labeled your namespace as restricted.

Secure Pod Manifest

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: secure-example
spec:
  securityContext:
    runAsNonRoot: true
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: secure-container
      image: nginx:stable
      securityContext:
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
```

This manifest defines a pod that adheres to the restricted standards, ensuring it runs as a non-root user and disables privilege escalation.

Best Practices for Pod Security

Adopting PSA necessitates a shift in how we approach pod security. Here are key best practices to consider:

- Gradual adoption: Start with privileged, move to baseline, and aim for restricted to minimize disruption.
- Audit and monitor: Utilize the audit and warn modes to identify non-compliant resources without enforcing changes.
- Continuous education: Keep your team informed about the latest security features and practices in Kubernetes.
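The audit-and-monitor practice can be combined with enforcement in a single namespace manifest: enforce baseline now while auditing and warning against restricted to prepare for a stricter level. A minimal sketch (the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo-apps                                  # illustrative namespace name
  labels:
    pod-security.kubernetes.io/enforce: baseline   # reject pods violating baseline
    pod-security.kubernetes.io/audit: restricted   # record restricted violations in the audit log
    pod-security.kubernetes.io/warn: restricted    # surface warnings to kubectl clients
```

Because audit and warn do not block admission, this setup reveals which workloads would fail restricted before you tighten the enforce label.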
Conclusion

As Kubernetes continues to mature, its security mechanisms evolve to offer more robust protections and simpler management. Pod Security Admission in Kubernetes 1.25+ represents a significant step forward in securing containerized environments, providing clear guidelines and practical tools for developers and architects. By understanding and implementing these new standards, you can significantly enhance the security posture of your Kubernetes deployments. Embracing these changes not only secures your applications but also aligns your security practices with the cutting-edge developments in Kubernetes. As we navigate this shift, the importance of adapting and continuously learning cannot be overstated; our journey toward more secure, efficient, and reliable container orchestration continues.
In the age of Big Data and Artificial Intelligence (AI), effectively managing and deploying machine learning (ML) models is essential for businesses aiming to leverage data-driven insights. PostgresML, a pioneering framework, seamlessly integrates ML model deployment directly into PostgreSQL, a widely used open-source relational database management system. This integration facilitates the effortless deployment and execution of ML models within the database environment, eliminating the need for intricate data pipelines and external services.

Introduction

Artificial Intelligence (AI) and Machine Learning (ML) have emerged as transformative technologies, enabling systems to learn from data, adapt to new inputs, and perform tasks without explicit programming. At the core of AI and ML are models: mathematical representations of patterns and relationships within data, trained to make predictions, classify data, or generate insights. However, the journey from model development to deployment poses unique challenges.

Model deployment involves integrating trained models into operational systems or applications, allowing them to make real-time decisions and drive business value. Yet, this process is not without complexities. One challenge is the management and scalability of deployed models across diverse environments, such as cloud platforms, edge devices, or on-premises infrastructure. Additionally, ensuring the reliability, security, and performance of deployed models in dynamic environments is critical. Integrating models seamlessly into existing software systems while minimizing disruption and maintaining compatibility further complicates the deployment process. Furthermore, the need for continuous monitoring, updating, and versioning of deployed models to adapt to evolving data distributions and business requirements presents ongoing challenges.
Despite these hurdles, overcoming the challenges of AI/ML model deployment is essential for unlocking the full potential of AI and ML in driving innovation and solving real-world problems.

PostgresML Architecture

PostgresML extends the capabilities of PostgreSQL with a suite of features aimed at streamlining the deployment and execution of machine learning (ML) models within the database environment. At its core, PostgresML comprises three main components, each playing a crucial role in integrating ML workflows with the PostgreSQL ecosystem:

Figure 1: PostgresML Architecture

- Model storage in PostgreSQL: PostgresML provides a dedicated schema within the PostgreSQL database for storing ML models. This schema serves as a centralized repository for all essential components of ML models, including metadata, hyperparameters, and serialized model artifacts. By leveraging PostgreSQL's robust storage capabilities, PostgresML ensures that ML models are securely and efficiently managed alongside other database objects.
- Integration with PostgreSQL's query execution engine: One of the key innovations introduced by PostgresML is its integration with PostgreSQL's query execution engine. By embedding ML model execution directly within SQL queries, PostgresML enables users to leverage their existing database infrastructure for executing ML predictions. This eliminates the need for complex data pipelines or external services, reducing latency and simplifying the overall deployment process.
- Model management APIs for simplified deployment: PostgresML exposes a comprehensive set of APIs designed to facilitate the management and deployment of ML models within the PostgreSQL environment. These APIs cover a wide range of functionality, including model training, evaluation, and deployment.
By providing developers with a familiar SQL-based interface, PostgresML empowers them to interact with ML models using standard database operations, streamlining the deployment process and accelerating the development of data-driven applications.

How PostgresML Differs From Traditional ML Deployment Approaches

PostgresML offers several distinctive features that set it apart from traditional ML deployment approaches.

Native Integration With PostgreSQL

One of the standout features of PostgresML is its seamless integration with PostgreSQL, the popular open-source relational database management system. By embedding ML model deployment directly within PostgreSQL, PostgresML eliminates the need for complex data pipelines or external services. This native integration not only reduces latency and overhead but also simplifies the overall deployment process, allowing organizations to leverage their existing database infrastructure for ML tasks.

SQL Interface for Model Management

PostgresML provides a user-friendly SQL-based interface for managing ML models, making it accessible to developers and data scientists familiar with SQL syntax. This interface enables users to perform various ML-related tasks, including model training, evaluation, and deployment, using standard database operations. By leveraging familiar tools and workflows, PostgresML empowers users to integrate ML workflows into their existing database environments, enhancing productivity and collaboration.

Scalability With Horizontal Scaling

Leveraging PostgreSQL's distributed architecture, PostgresML is designed to scale horizontally to accommodate large datasets and high-throughput workloads. By distributing data and computation across multiple nodes, PostgresML ensures that ML tasks can be executed efficiently, even as data volumes grow.
This scalability enables organizations to deploy ML models at scale without compromising performance or reliability, making PostgresML an ideal solution for handling the demands of modern data-driven applications.

Robust Security Features

PostgresML inherits PostgreSQL's robust security features, ensuring that ML models and data are protected against unauthorized access and tampering. By leveraging PostgreSQL's advanced security mechanisms, including role-based access control (RBAC), data encryption, and auditing capabilities, PostgresML provides organizations with the confidence that their sensitive ML assets are safeguarded against potential threats. This built-in security framework makes PostgresML a trusted platform for deploying mission-critical ML applications in a secure and compliant manner.

Example Usage

To demonstrate PostgresML's capabilities in deploying machine learning (ML) models, let's walk through a detailed example scenario. We initiate the process by creating a table named `iris_data` within the PostgreSQL database schema, designed to store training data for an ML model. Each row in this table represents a sample observation of iris flower characteristics, including sepal and petal dimensions, along with the corresponding species label. Following the creation of the table, we populate it with sample data entries to facilitate model training.

The subsequent step uses the `CREATE MODEL` statement, a core feature of PostgresML, to train a logistic regression model named `iris_model` on the training data stored in the `iris_data` table. The logistic regression algorithm, specified as the model function, learns the underlying patterns and relationships within the training data, enabling the model to make predictions based on new input instances.
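Pieced together from the description above, the example might look like the following SQL sketch. Note that the `CREATE MODEL` and `PREDICT` syntax follows this article's description rather than a documented PostgresML release, and the column names and sample row are assumptions added for illustration:

```sql
-- Training data table described in the example (column names are assumptions)
CREATE TABLE iris_data (
    sepal_length REAL,
    sepal_width  REAL,
    petal_length REAL,
    petal_width  REAL,
    species      TEXT
);

-- Populate with sample observations (values are illustrative)
INSERT INTO iris_data VALUES (5.1, 3.5, 1.4, 0.2, 'setosa');

-- Train a logistic regression model, per the article's description
CREATE MODEL iris_model
    USING logistic_regression
    FROM iris_data
    TARGET species;

-- Apply the trained model to a separate testing dataset
SELECT sepal_length, sepal_width, petal_length, petal_width,
       PREDICT(iris_model) AS predicted_species
FROM testing_data;
```

Treat this as a conceptual outline of the train-then-predict flow inside the database; consult the PostgresML documentation for the exact statements your installed version supports.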
Finally, we demonstrate the practical utility of the trained ML model by making predictions on a separate testing dataset (`testing_data`). Leveraging the `PREDICT` function provided by PostgresML, we apply the trained `iris_model` to generate predictions of the iris species for each observation in the testing dataset. The resulting predictions are retrieved alongside the input features (sepal and petal dimensions), facilitating further analysis and evaluation of the model's performance.

In essence, this example showcases the seamless integration of ML model training and deployment within the PostgreSQL environment. By leveraging familiar SQL syntax and database functionality, developers and data scientists can harness the power of machine learning without specialized tools or external services, streamlining the development and deployment of ML applications.

Performance Evaluation Against Traditional ML Deployment Approaches

To assess PostgresML's performance capabilities, a series of experiments was conducted comparing it against traditional machine learning (ML) deployment approaches. These experiments focused on key performance metrics such as latency, throughput, and scalability, with particular emphasis on PostgresML's suitability for large-scale deployments. The experimental setup covered a diverse range of workload scenarios, each representing different levels of data complexity and processing demands. These scenarios were designed to simulate real-world ML deployment tasks, including model training, inference, and evaluation. Both PostgresML and traditional ML deployment approaches underwent testing under controlled conditions, facilitating a direct and unbiased comparison of their performance characteristics.
Upon completion of the experiments, the results were analyzed to assess PostgresML's performance relative to traditional ML deployment approaches. The findings revealed consistent performance improvements across all evaluated metrics: reduced latency, increased throughput, and enhanced scalability. Notably, PostgresML demonstrated superior performance in large-scale deployments.

Furthermore, the experiments underscored the robustness and reliability of PostgresML under varying workload conditions, highlighting its ability to handle high-volume data processing tasks with minimal overhead. This scalability and resilience can be attributed to PostgresML's integration with PostgreSQL's distributed architecture, which lets it leverage the parallel processing capabilities of distributed database systems.

Figure 2: Comparison of Latency Between PostgresML and Traditional Approaches

In summary, the performance evaluation showcases PostgresML's effectiveness in addressing the challenges of ML deployment, particularly at large scale. For a visual representation of the comparison, refer to Figure 2, which illustrates PostgresML's latency advantage across varying dataset sizes.

Conclusion

In conclusion, PostgresML stands at the forefront of innovation in machine learning (ML) deployment and management, offering an approach that integrates AI capabilities directly into the database environment. By leveraging the robust features of PostgreSQL, PostgresML streamlines the entire ML lifecycle, from data preparation to model deployment, with notable efficiency and ease of use.
Looking ahead, the future of PostgresML holds immense potential for further advancements, including scalability enhancements, performance optimizations, and the expansion of its application domains across various industries. As businesses increasingly rely on data-driven insights to fuel their decision-making processes, PostgresML emerges as a pivotal tool for unlocking the full potential of AI-driven analytics and driving innovation in organizational workflows. Readers are encouraged to explore the world of PostgresML and discover its vast possibilities for transforming data workflows and accelerating business growth. By embracing PostgresML, organizations can tap into the power of AI-driven insights and gain a competitive edge in today's data-centric landscape.
Editor's Note: The following is an article written for and published in DZone's 2024 Trend Report, The Modern DevOps Lifecycle: Shifting CI/CD and Application Architectures.

Thirty years later, I still love being a software engineer. In fact, I've recently read Will Larson's "Staff Engineer: Leadership beyond the management track," which has further ignited my passion for solving complicated problems programmatically. Knowing that employers continue to accommodate the staff, principal, and distinguished job classifications provides a breath of fresh air for technologists who want to thrive as engineers. Unfortunately, with the good sometimes comes the not-so-good. For today's software engineer, the reality isn't quite so ideal, as toil continues to find a way to disrupt productivity on a routine basis. One common example is deploying our artifacts, especially into production environments. It's time to place a higher priority on deployment automation.

The Traditional Deployment Lifecycle

The development lifecycle for a software engineer typically centers around three simple steps: develop, review, and merge. Building upon these steps, the following flowchart illustrates a traditional deployment lifecycle:

Figure 1. Traditional development lifecycle

In Figure 1, a software engineer introduces an update to the underlying source code. Once a merge request is created, the continuous integration (CI) tooling executes unit tests and performs static code analysis. If these steps complete successfully, a second software engineer performs a code review of the changes. If those changes are approved, the original software engineer merges the source code changes into the main branch. At this point, the software engineer starts a deployment to the development environment (DEV), which is handled by the continuous delivery (CD) tooling. In this example, the release candidate is deployed to DEV and additional tests (like regression tests) are executed.
If both steps pass, the software engineer initiates a deployment into the QA environment via the same CD tooling. Next, the software engineer creates a change ticket to release the source code update into the production environment (Prod). Once the approving manager approves the change ticket, the software engineer initiates a deployment into Prod. This step instructs the CD tooling to perform the Prod deployment. Unfortunately, there are several points in the flow where human-based tasks are involved.

Time to Focus on Toil Elimination

Google Site Reliability Engineering's Eric Harvieux defined Toil as noted below:

"Toil is the kind of work that tends to be manual, repetitive, automatable, tactical, devoid of enduring value, and that scales linearly as a service grows."

Software engineers should alter their mindset to become cognizant of identifying Toil in their roles and responsibilities. Once Toil has been acknowledged, tasks should be established to eliminate the items that do not foster productivity. Most Agile teams reserve 20% of sprint capacity for backlog tasks, and Toil elimination is always a perfect candidate for such work. In Figure 1, the following tasks were handled manually and should be viewed as Toil:

Start DEV Deployment
Start QA Deployment
Create Change Ticket
Manager Approve Change Ticket
Start Prod Deployment

In order to drive toward next-gen deployment lifecycles, it is important to become Toil-free.

DevOps Lifecycle and Deployment Automation

While Toil elimination is an important aspect of next-gen deployment lifecycles, deployment automation via DevOps is equally important. Using DevOps pipelines, we can automate the deployment flow as noted below:

Create the release candidate image when the merge-to-main event is completed.
Automate the deployment to DEV when a new release candidate is created.
Continue to deploy to QA upon successful deployment to DEV.
Create the change ticket programmatically once the QA deployment is successful.
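The automated flow described above can be sketched as a pipeline definition. The following GitHub Actions-style workflow is only an illustration of the chaining, not a prescribed implementation; the script names and the change-ticket step are hypothetical placeholders:

```yaml
# Hypothetical sketch of the automated flow; script paths are illustrative.
name: deploy-on-merge
on:
  push:
    branches: [main]               # the merge-to-main event starts the pipeline

jobs:
  build-release-candidate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./scripts/build-image.sh        # create the release candidate image

  deploy-dev:
    needs: build-release-candidate           # DEV deploys once a candidate exists
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/deploy.sh dev
      - run: ./scripts/regression-tests.sh dev

  deploy-qa:
    needs: deploy-dev                        # QA follows a successful DEV deploy
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/deploy.sh qa

  create-change-ticket:
    needs: deploy-qa                         # ticket is created programmatically
    runs-on: ubuntu-latest
    steps:
      - run: ./scripts/create-ticket.sh
```

Each `needs:` edge encodes one of the automated hand-offs that previously required a human.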
In implementing the automation noted above, three of the five human-based tasks are eliminated. To mitigate the remaining two tasks, the observability platform can be leveraged. Service owners often rely on their observability platform to support and maintain applications running in production. By extending that coverage to include the lower environments (like DEV and QA), it is possible for DevOps pipelines to interact with metrics emitted during the deployment lifecycle using an open-source tool such as Ansible. This means that as the DevOps pipelines make changes to an environment, an Ansible playbook can monitor a given set of metrics to determine whether the deployment is running as expected. If no anomalies or errors surface, the pipeline continues running. Otherwise, the current task aborts and the prior state of the deployment is restored. As a result, using a collection of metrics defined by the service owner and the observability platform, the need for manager approval diminishes. This is because the approval of the merge request is where the change was analyzed. Additionally, the approving-manager step was often added only because a better alternative did not exist. With the manager approval step replaced, the deployment to Prod can be triggered by the same DevOps pipeline. In taking this approach, the status of the change ticket can reflect the actual state as tasks are completed by the automation. Example statuses include Created, To Be Reviewed, Approved, Started, In Progress, and Completed (or Completed With Errors).

Next-Gen Deployment Lifecycle

By eliminating Toil and introducing DevOps automation via pipelines, a next-gen deployment lifecycle can be created.

Figure 2. Next-gen deployment lifecycle

In Figure 2, the deployment lifecycle becomes much smaller and no longer requires the approving manager role. Instead, the observability platform is leveraged to monitor the DevOps pipelines.
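As a concrete sketch of the metric-gating idea, an Ansible task can poll a metrics endpoint after a deployment step and fail the play when a threshold is breached, which halts the pipeline so the prior state can be restored. The endpoint URL, response shape, and threshold below are all hypothetical:

```yaml
# Hypothetical sketch: gate a pipeline step on a metric from the observability
# platform. The endpoint and response shape ({"value": ...}) are illustrative.
- name: Fetch the post-deployment error rate
  ansible.builtin.uri:
    url: "https://metrics.example.com/api/v1/error-rate?env=qa"
    return_content: true
  register: metric_result

- name: Abort the play (allowing the prior state to be restored) on anomalies
  ansible.builtin.assert:
    that:
      - (metric_result.json.value | float) < 0.05
    fail_msg: "Error rate above 5% after deployment; aborting"
```

A failed `assert` stops the play at that task, which is exactly the abort-and-restore behavior the article describes.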
With the next-gen deployment lifecycle, the software engineer performs the merge-to-main step after the merge request has been approved. From this point forward, the remainder of the process is completely automated. If any errors occur during the CD pipeline steps, the pipeline stops and the prior state is restored. Compared to Figure 1, all of the existing Toil has been completely eliminated, and teams can adopt the mindset that a merge-to-main event is the entry point to the next production release. What's even more exciting is the improvement teams will see in their commit-to-deploy ratios by adopting this strategy.

Shattering Unjustified Blockers

When considering next-gen deployment lifecycles, three common objections are often raised:

1. We Need to Let the Business Know Before We Can Deploy

Software engineers should strive to enhance or update services in a manner where business-level approval is not a requirement. The use of feature flags and versioned URIs is one example of how automated releases can be achieved without impacting existing customers. However, it is always a great idea to communicate what features and fixes are planned, along with the expected time frames.

2. The Manager Should Know What Is About to Be Deployed

While this is a fair statement, the approving manager's knowledge of the update should be established during the sprint planning stage (or similar). Once a given set of work begins, the expectation is that it will be completed and deployed during that development iteration. Like software engineers, managers should adopt the mindset that merge-to-main ultimately results in a deployment to production.

3. At Least One Person Should Approve Changes Before They Are Pushed to Production

This is a valid statement, and it actually occurs during the merge request stage. In fact, the remaining approval in the next-gen deployment lifecycle sits where it does for a very good reason.
When one or more approvers review a merge request, they are in the best position — at the best point in time — to review and challenge the work being completed. Thereafter, it makes far better sense for the observability platform to monitor the DevOps pipelines for any unexpected issues.

Conclusion

The traditional development lifecycle often includes human-based approvals and an unacceptable amount of Toil. This Toil not only becomes a source of frustration but also impacts the productivity and mental health of the software engineer over time. Teams should make it a priority to eliminate Toil in their roles and responsibilities and drive toward next-gen deployment lifecycles using DevOps pipelines integrated with existing observability platforms. Taking this approach will allow teams to adopt a "merge-to-main equals deploy-to-Prod" mindset. In doing so, commit-to-deploy ratios will improve as a nice side effect. Thirty years ago, I found my passion as a software engineer, and 30 years later, I still love being a software engineer. In fact, I am even more excited for the path ahead, free from human-based approvals thanks to DevOps automation and Toil elimination. Have a really great day!

Resources:

"Staff Engineer: Leadership beyond the management track" by Will Larson, 2021
"Identifying and Tracking Toil Using SRE Principles" by Eric Harvieux, 2020
"Monitoring as code with Sensu + Ansible" by Jef Spaleta, 2021
Kubernetes stands out as the quintessential solution for managing containerized applications. Despite its popularity, establishing a Kubernetes cluster remains an intricate process, particularly when aiming for a high-availability configuration. This blog post will walk through the process of constructing a multi-master Kubernetes cluster on AWS using Kops, a potent open-source tool that simplifies cluster deployment. By the conclusion of this tutorial, you will possess the expertise to launch your own resilient, production-grade Kubernetes environment.

Understanding the Essentials

Before we embark on our journey, it is vital to prepare the tools and access required for a seamless setup. You will need an active AWS account with appropriate permissions for creating and managing resources such as EC2 instances, VPCs, and Route53 zones. Additionally, command-line access is crucial; thus, the AWS CLI should be installed and configured with the necessary access credentials. The cornerstone of this guide is the Kops and kubectl tools. Kops is instrumental in cluster creation, while kubectl is essential for communicating with the Kubernetes cluster. For those contemplating a production cluster, owning a domain in AWS Route53 is advisable, although not compulsory for test configurations.

High Availability Demystified

A multi-master setup, synonymous with a High Availability (HA) cluster, ensures your Kubernetes cluster operates with multiple master nodes. This strategy is indispensable for production environments, guaranteeing the cluster's continuous functionality even in the event of a master node failure, thus eliminating a single point of failure.

Crafting the Environment

Integrating a Route53 Domain

Though optional, integrating a Route53 domain is recommended for production environments. This integration involves registering a new domain or configuring a hosted zone for a pre-existing one.
Recording the domain name is crucial, as it forms the cluster's base URL.

Establishing an S3 Bucket for Kops State Storage

Kops requires a "state store," which is facilitated by an S3 bucket holding cluster states and configurations. It's imperative to activate versioning on the S3 bucket, safeguarding against unintentional deletions and enabling convenient rollbacks.

Configuring Environment Variables

Environment variables streamline the process by storing data that can be reused throughout the session. Set KOPS_CLUSTER_NAME to your domain and KOPS_STATE_STORE to your S3 bucket's URL, ensuring your commands know your cluster name and where to store Kops' state files.

Installing and Configuring Kops and Kubectl

Initiating Kops Installation

Begin by installing Kops. You can do this by downloading the latest release from its GitHub page or by using package managers like Homebrew for macOS or Chocolatey for Windows. The process might differ slightly based on your operating system.

Deploying Kubectl

Following Kops, the next step is installing kubectl. This tool is vital as it allows you to interact with your Kubernetes cluster. Similar to Kops, you can download kubectl from its official website or install it via a package manager.

Launching the Cluster

Creating Cluster Configuration

With the environment now ready, invoke the following command to create a cluster configuration:

kops create cluster --node-count=3 --node-size=t2.medium --zones=us-west-2a --name=${KOPS_CLUSTER_NAME} --master-size=t2.medium --master-count=3

This command instructs Kops to generate a cluster configuration with three worker nodes and three master nodes in the specified AWS zones. (To spread the masters across availability zones, list several, e.g., --zones=us-west-2a,us-west-2b,us-west-2c.) The instance sizes for the nodes are also defined here. Notably, settings such as zones and instance sizes should align with your project and budget requirements.
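The environment-variable and state-store setup described above can be sketched as a few shell commands. The domain, bucket name, and region here are illustrative placeholders only:

```shell
# Illustrative values only; substitute your own domain and bucket name.
export KOPS_CLUSTER_NAME="k8s.example.com"
export KOPS_STATE_STORE="s3://example-kops-state-store"

# Create the state-store bucket and enable versioning so cluster state
# can be recovered after accidental deletions or bad edits.
aws s3api create-bucket \
  --bucket example-kops-state-store \
  --region us-west-2 \
  --create-bucket-configuration LocationConstraint=us-west-2
aws s3api put-bucket-versioning \
  --bucket example-kops-state-store \
  --versioning-configuration Status=Enabled
```

With these variables exported, subsequent kops commands pick up the cluster name and state store automatically.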
Reviewing and Modifying the Cluster Manifest

Before applying the configuration, review and, if necessary, modify the cluster manifest file. To inspect the configuration, use the command kops edit cluster ${KOPS_CLUSTER_NAME}. This step is crucial for fine-tuning configurations, such as the networking model, or for enabling certain features.

Deploying the Cluster

Upon finalizing your configuration, deploy your cluster with the following:

kops update cluster --name ${KOPS_CLUSTER_NAME} --yes

This command triggers the provisioning of the AWS resources defined in your cluster configuration.

Validating the Cluster

Executing a Validation Check

Post-creation, ensure your cluster is correctly configured and all instances are operational with the command:

kops validate cluster

This step is vital as it confirms whether your nodes are ready and the Kubernetes control plane is responding accurately.

FAQs

1. What Are the Benefits of Using a Multi-Master Setup in Kubernetes?

A multi-master setup in Kubernetes, a high-availability (HA) cluster, ensures the cluster's control plane remains accessible and operational even if one of the master nodes fails. This setup is crucial for production environments where continuous app availability is required. It prevents downtime during maintenance and mitigates the risk of a single point of failure.

2. Can I Use Kops to Create a Single-Master Cluster and Then Convert It to a Multi-Master Setup?

While Kops is an incredibly flexible tool, converting a single-master cluster to a multi-master setup is not its strongest suit. Typically, you must create a new cluster with the desired multi-master configuration and migrate your workloads. However, always check the latest Kops documentation or release notes, as new features and capabilities are frequently added.

3. How Does Kops Manage the Underlying Infrastructure for Kubernetes on AWS?
Kops automates the provisioning of the necessary infrastructure on AWS to run a Kubernetes cluster. It sets up EC2 instances for your master and worker nodes, configures networking and security groups, and provisions other necessary AWS resources like auto-scaling groups, IAM roles, and Route53 records. It effectively abstracts away many of the complexities associated with manually setting up a Kubernetes cluster on AWS.

4. What Happens if a Master Node Fails in a Multi-Master Kubernetes Cluster?

In a multi-master setup, if one master node fails, the Kubernetes control plane remains available because the other master nodes continue to serve the cluster. The failed master node can be replaced automatically if you've configured your cluster accordingly. Alternatively, you might need to intervene manually to replace the failed node, depending on your specific setup.

5. Are There Any Cost Considerations When Running a Multi-Master Kubernetes Cluster on AWS?

Running a multi-master cluster will be more expensive than a single-master cluster because you're utilizing additional EC2 instances and other resources, which can add up over time. However, the benefit of improved resilience and uptime often outweighs the additional cost, especially for production environments. It's important to monitor your AWS resources and costs to ensure they align with your budget and operational needs.

Conclusion

Setting up a multi-master Kubernetes cluster on AWS using Kops enhances your application's resilience and ensures uninterrupted availability. Although the process might seem intricate, the high-availability setup is indispensable for production environments. By following this detailed guide, you can deploy a robust, fault-tolerant Kubernetes infrastructure tailor-made for your organizational needs. Remember, the key to a successful Kubernetes setup lies in meticulous configuration, constant monitoring, and timely updates. Welcome to the future of application deployment!
I started research for an article on how to add a honeytrap to a GitHub repo. The idea behind a honeypot weakness is that a hacker will follow through on it and make their presence known in the process. My plan was to place a GitHub personal access token in an Ansible vault protected by a weak password. Should an attacker crack the password and use the token to clone the private repository, a webhook would trigger and mail a notification that the honeypot repo had been cloned and the password cracked. Unfortunately, GitHub does not seem to allow webhooks to be triggered on cloning, as it does for some of its higher-level actions. This set me thinking that platforms as standalone systems are not designed with Dev(Sec)Ops integration in mind. DevOps engineers have to bite the bullet and always find ways to secure pipelines end-to-end. I therefore decided instead to investigate how to prevent code theft using tokens or private keys gained by nefarious means.

Prevention Is Better Than Detection

It is not best practice to keep secret material on hard drives in the belief that root-only access is sufficient security. Any system administrator or hacker who gains root can view the secret in the open. Secrets should, rather, be kept inside Hardware Security Modules (HSMs) or, at the very least, a secret manager. Furthermore, tokens and private keys should never be passed in as command-line arguments, since they might be written to a log file. A way to solve this problem is to make use of a super-secret master key to initiate proceedings and finalize using short-lived lesser keys. This is similar to the problem of sharing the first key in applied cryptography: once the first key has been agreed upon, successive transactions can be secured using session keys. It goes without saying that the first key has to be stored in a Hardware Security Module, and all operations against it have to happen inside the HSM.
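To make the command-line concern concrete, here is a small sketch of fetching a token from a secret manager at the moment of use so that it never appears as a command-line argument. The HashiCorp Vault secret path and repository URL are hypothetical:

```shell
# Hypothetical sketch; the Vault secret path and repo URL are illustrative.
# Reading the token into an environment variable keeps it out of shell
# history, `ps` output, and command-line audit logs.
GITHUB_TOKEN="$(vault kv get -field=token secret/ci/github-pat)"
export GITHUB_TOKEN

# A one-off credential helper hands the token to Git at authentication time,
# so it is never embedded in the clone URL or passed as an argument.
git -c credential.helper='!f() { echo "username=x-access-token"; echo "password=${GITHUB_TOKEN}"; }; f' \
  clone https://github.com/example/private-repo.git
```

Because the helper expands the variable only when Git invokes it, the token stays out of the process arguments entirely.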
I decided to try out something similar for when Ansible clones private Git repositories. Although I will illustrate this with GitHub, I am pretty sure something similar can be set up for other Git platforms as well.

First Key

GitHub personal access tokens can be used to perform a wide range of actions on your GitHub account and its repositories. A token authenticates and authorizes from both the command line and the GitHub API, so it clearly can serve as the first key. Personal access tokens are created by clicking your avatar in the top right and selecting Settings. A left nav panel should appear, from where you select Developer settings. The menu for personal access tokens will display, where you can create the token. I created a classic token and gave it the following scopes/permissions: repo, admin:public_key, user, and admin:gpg_key. Take care to store the token in a reputable secret manager, from where it can be copied and pasted when the Ansible play asks for it at the start. This secret manager should clear the copy buffer after a few seconds to prevent attacks utilizing attention diversion.

vars_prompt:
  - name: github_token
    prompt: "Enter your github personal access token?"
    private: true

Establishing the Session

GitHub deployment keys give access to private repositories. They can be created by an API call or from the repo's top menu by clicking Settings. With the personal access token as the first key, a deployment key can finish the operation as the session key. Specifically, Ansible authenticates itself using the token, creates the deployment key, authorizes the clone, and deletes the key immediately afterward. The code from my previous post relied on adding Git URLs containing the tokens to the Ansible vault. This has now been improved to use temporary keys, as envisioned in this post. An Ansible role provided by Asif Mahmud has been amended for this, as can be seen in the usual GitHub repo.
The critical snippets are:

- name: Add SSH public key to GitHub account
  ansible.builtin.uri:
    url: "https://api.{{ git_server_fqdn }}/repos/{{ github_account_id }}/{{ repo }}/keys"
    validate_certs: yes
    method: POST
    force_basic_auth: true
    body:
      title: "{{ key_title }}"
      key: "{{ key_content.stdout }}"
      read_only: true
    body_format: json
    headers:
      Accept: application/vnd.github+json
      X-GitHub-Api-Version: 2022-11-28
      Authorization: "Bearer {{ github_access_token }}"
    status_code:
      - 201
      - 422
  register: create_result

The GitHub API is used to add the deploy key to the private repository. Note the use of the access token, typed in at the start of play, to authenticate and authorize the request.

- name: Clone the repository
  shell: |
    GIT_SSH_COMMAND="ssh -i {{ key_path }} -v -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null" {{ git_executable }} clone git@{{ git_server_fqdn }}:{{ github_account_id }}/{{ repo }}.git {{ clone_dest }}

- name: Switch branch
  shell: "{{ git_executable }} checkout {{ branch }}"
  args:
    chdir: "{{ clone_dest }}"

The repo is cloned, followed by a switch to the required branch.

- name: Delete SSH public key
  ansible.builtin.uri:
    url: "https://api.{{ git_server_fqdn }}/repos/{{ github_account_id }}/{{ repo }}/keys/{{ create_result.json.id }}"
    validate_certs: yes
    method: DELETE
    force_basic_auth: true
    headers:
      Accept: application/vnd.github+json
      X-GitHub-Api-Version: 2022-11-28
      Authorization: "Bearer {{ github_access_token }}"
    status_code:
      - 204

Deletion of the deployment key happens directly after the clone and switch, again via the API.

Conclusion

The short life of the deployment key greatly enhances the security of the DevOps pipeline. Only the token has to be kept secured at all times, as is the case for any first key. Ideally, you should integrate Ansible with a compatible HSM platform. I thank Asif Mahmud, whose code I used to illustrate the concept of using temporary session keys when cloning private Git repositories.
When it comes to Java web servers, Apache Tomcat remains a strong favorite. Some of these instances have been containerized over the years, but many still run in the traditional setup of a Linux-based virtual machine or even on bare metal. Red Hat JBoss Web Server (JWS) combines the servlet engine (Apache Tomcat) with the web server (Apache HTTPD) and modules for load balancing (mod_jk and mod_cluster). Ansible is an automation tool that provides a suite of tools for managing an enterprise at scale. In this article, we will illustrate how Ansible can be used to completely automate the deployment of a JBoss Web Server 6 instance on a Red Hat Enterprise Linux 9 server. This automation encompasses the following tasks:

Retrieve the archive containing the JBoss Web Server from the Red Hat Customer Portal and install the files on the system.
Configure the Red Hat Enterprise Linux (RHEL) operating system, including the users, groups, and setup files required to enable JBoss Web Server as a systemd service.
Ensure the required Java Virtual Machine is installed.
Fine-tune the configuration of the JBoss Web Server, such as binding it to the appropriate interface and port.
Deploy web applications, and enable and start the JBoss Web Server as a systemd service.
Perform a health check to ensure that the deployed application is accessible.

Our Ansible playbook fully automates all of those operations, so no manual steps are required.

Preparing the Target Environment

Prerequisites

Before we start on the automation work, we need to specify the target environment. In this case, we'll be using Red Hat Enterprise Linux 9 with Python 3.9. We'll use this configuration on both the Ansible control node (where Ansible is executed), referred to from now on as the controller, and the Ansible target (the system being configured).
The controller for this demonstration has the following requirements:

Shell
$ cat /etc/redhat-release
Red Hat Enterprise Linux release 9.3 (Plow)

Verifying the version of Ansible is pretty straightforward, and it also provides the needed information on the Python version used to run it:

Shell
$ ansible --version
ansible [core 2.14.9]
  config file = /work/ansible.cfg
  configured module search path = ['/root/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python3.9/site-packages/ansible
  ansible collection location = /root/.ansible/collections:/usr/share/ansible/collections
  executable location = /usr/bin/ansible
  python version = 3.9.18 (main, Jan 4 2024, 00:00:00) [GCC 11.4.1 20230605 (Red Hat 11.4.1-2)] (/usr/bin/python3)
  jinja version = 3.1.2
  libyaml = True

Note: The procedure in this article may not execute successfully if you use a different Python version or target operating system.

Installing the Red Hat Ansible Certified Content Collection

Once you have Red Hat Enterprise Linux 9 set up and Ansible 2.14 ready to go, you need to install the Red Hat Ansible Certified Content Collection 2.0 for Red Hat JBoss Web Server. To install it, you will need to configure Ansible to use Red Hat Automation Hub as the preferred Galaxy server. Follow the instructions on Automation Hub to retrieve your token and update the ansible.cfg configuration on your Ansible controller.
Update the <your-token> field with the token obtained from Automation Hub:

INI
[galaxy]
server_list = automation_hub, galaxy

[galaxy_server.galaxy]
url=https://galaxy.ansible.com/

[galaxy_server.automation_hub]
url=https://cloud.redhat.com/api/automation-hub/api/galaxy/
auth_url=https://sso.redhat.com/auth/realms/redhat-external/protocol/openid-connect/token
token=<your-token>

If you are not familiar with Ansible, note that this configuration file lives in the same directory as the Ansible playbook we are going to design for our JWS deployment. Once you have configured Ansible to use Automation Hub, install the certified collection:

Shell
$ ansible-galaxy collection install redhat.jws
Starting galaxy collection install process
Process install dependency map
Starting collection install process
Downloading https://console.redhat.com/api/automation-hub/v3/plugin/ansible/content/published/collections/artifacts/redhat-jws-2.0.0.tar.gz to /root/.ansible/tmp/ansible-local-88isxfxlvv/tmpvujtdugq/redhat-jws-2.0.0-zf_lh9ed
Installing 'redhat.jws:2.0.0' to '/root/.ansible/collections/ansible_collections/redhat/jws'
Downloading https://console.redhat.com/api/automation-hub/v3/plugin/ansible/content/published/collections/artifacts/redhat-runtimes_common-1.1.3.tar.gz to /root/.ansible/tmp/ansible-local-88isxfxlvv/tmpvujtdugq/redhat-runtimes_common-1.1.3-pf34k4r_
redhat.jws:2.0.0 was installed successfully
Installing 'redhat.runtimes_common:1.1.3' to '/root/.ansible/collections/ansible_collections/redhat/runtimes_common'
Downloading https://console.redhat.com/api/automation-hub/v3/plugin/ansible/content/published/collections/artifacts/ansible-posix-1.5.4.tar.gz to /root/.ansible/tmp/ansible-local-88isxfxlvv/tmpvujtdugq/ansible-posix-1.5.4-yie4utve
redhat.runtimes_common:1.1.3 was installed successfully
Installing 'ansible.posix:1.5.4' to '/root/.ansible/collections/ansible_collections/ansible/posix'
ansible.posix:1.5.4 was installed successfully

Ansible Galaxy fetches and
downloads the collection's dependencies. These dependencies include the redhat.runtimes_common collection, which helps facilitate the retrieval of the archive containing the JBoss Web Server from the Red Hat Customer Portal.

Red Hat Customer Portal Credentials

For the collection to be able to download the JWS archive from the Red Hat Customer Portal, we need to supply the credentials associated with a Red Hat service account. One way to provide those values is to create a service_account.yml file, which can be passed to Ansible as an extra source of variables:

YAML
---
rhn_username: <service_account_id>
rhn_password: <service_account_password>

Installing the Red Hat JBoss Web Server

The configuration steps in this section include downloading JBoss Web Server, installing Java, and enabling JBoss Web Server as a system service (systemd).

Configuring the JVM

JBoss Web Server is a Java-based server, so the target system must have a Java Virtual Machine (JVM) installed. Although Ansible primitives can perform such tasks natively, the redhat.jws collection can take care of this task as well, provided that the jws_java_version variable is defined. By default, the value is the latest Red Hat supported version of OpenJDK (17). While we will keep the latest version for this demonstration, note that a different version of OpenJDK can be set using the jws_java_version variable:

YAML
jws_java_version: 11

Note: This feature works only if the target system's distribution belongs to the Red Hat family.

Enabling JBoss Web Server as a System Service (systemd)

The JBoss Web Server on the target system should run as a systemd service. The collection can also take care of this task if the jws_systemd_enabled variable is defined as True (which is the default value, as the target systems are expected to be RHEL machines).

Note: This configuration works only when systemd is installed and the system belongs to the Red Hat family.
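The variables discussed above can also be set from a minimal playbook of your own that invokes the collection's role. This is only a sketch: the inventory group name is a placeholder, and only variables mentioned in this article are shown; consult the collection's documentation for the authoritative variable list:

```yaml
---
# Illustrative sketch; "jws_servers" is a placeholder inventory group.
- name: Install and configure Red Hat JBoss Web Server
  hosts: jws_servers
  become: true
  vars:
    jws_java_version: 17        # default: latest Red Hat supported OpenJDK
    jws_systemd_enabled: true   # run JBoss Web Server as a systemd service
  roles:
    - redhat.jws.jws
```

Running such a playbook with the service-account variables file (-e @service_account.yml) lets the role download, install, and register JWS in one pass.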
Running the Playbook

The Red Hat Ansible Certified Content Collection comes with a playbook that can be used directly to ensure that JWS is properly installed on target instances. Execute the following command to run the playbook included within the collection, along with the extra variables file created previously:

Shell
$ ansible-playbook -i inventory -e @service_account.yml redhat.jws.playbook

PLAY [Red Hat JBoss Web Server installation and configuration] ***************** TASK [Gathering Facts] ********************************************************* ok: [localhost] TASK [redhat.jws.jws : Validating arguments against arg spec 'main'] *********** ok: [localhost] TASK [redhat.jws.jws : Check for conflicting Java variables] ******************* skipping: [localhost] TASK [redhat.jws.jws : Set default values] ************************************* skipping: [localhost] TASK [redhat.jws.jws : Check that jws_home has been defined.] ****************** ok: [localhost] => { "changed": false, "msg": "All assertions passed" } TASK [redhat.jws.jws : Add firewalld to dependencies list (if enabled)] ******** skipping: [localhost] TASK [redhat.jws.jws : Add 'openssl' and 'apr' to dependencies list required for natives (if enabled)] *** skipping: [localhost] TASK [redhat.jws.jws : Include tasks for Java installation (if Java version is provided)] *** included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/java_install.yml for localhost TASK [redhat.jws.jws : Add 'java-17-openjdk-headless' to dependencies list] **** ok: [localhost] TASK [redhat.jws.jws : Determine JAVA_HOME for selected JVM RPM] *************** ok: [localhost] TASK [redhat.jws.jws : Install required dependencies] ************************** included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/fastpackage.yml for localhost TASK [redhat.jws.jws : Check if "zip, unzip, tzdata, sudo, java-17-openjdk-headless" packages are already installed] *** ok: [localhost]
TASK [redhat.jws.jws : Add missing packages to the yum install list] *********** ok: [localhost] TASK [redhat.jws.jws : Install packages: ['java-17-openjdk-headless']] ********* changed: [localhost] TASK [redhat.jws.jws : Ensure tomcatjss rpm is not installed] ****************** ok: [localhost] TASK [redhat.jws.jws : Create group: tomcat] *********************************** changed: [localhost] TASK [redhat.jws.jws : Create user: tomcat] ************************************ changed: [localhost] TASK [redhat.jws.jws : Check state of install_dir: /opt] *********************** ok: [localhost] TASK [redhat.jws.jws : Ensure install dir is created: /opt] ******************** skipping: [localhost] TASK [redhat.jws.jws : Set defaults values based on facts (if values not provided)] *** included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/defaults.yml for localhost TASK [redhat.jws.jws : Set filename for JWS zipfile] *************************** ok: [localhost] TASK [redhat.jws.jws : Set native zipfile architecture (if not provided)] ****** ok: [localhost] TASK [redhat.jws.jws : Set RHEL major version based on facts (if not provided).] *** ok: [localhost] TASK [redhat.jws.jws : Set filename for JWS native zipfile] ******************** ok: [localhost] TASK [redhat.jws.jws : Ensure patch version is specified when installing offline.] *** skipping: [localhost] TASK [redhat.jws.jws : Ensure credentials are defined when installing from JBossNetwork API.] 
*** ok: [localhost] TASK [redhat.jws.jws : Check main zipfile] ************************************* skipping: [localhost] TASK [redhat.jws.jws : Check native zipfile exists] **************************** skipping: [localhost] TASK [redhat.jws.jws : Check patch zipfile exists] ***************************** skipping: [localhost] TASK [redhat.jws.jws : Check native patch zipfile exists] ********************** skipping: [localhost] TASK [redhat.jws.jws : Include install tasks] ********************************** included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/install.yml for localhost TASK [redhat.jws.jws : Check arguments] **************************************** ok: [localhost] TASK [redhat.jws.jws : Check working directory /work for local repository] ***** ok: [localhost] TASK [redhat.jws.jws : Display install method] ********************************* ok: [localhost] => { "msg": "Install method: zipfiles" } TASK [redhat.jws.jws : Include installation tasks using zipfiles method] ******* included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/install/local.yml for localhost TASK [redhat.jws.jws : Deploy jws-6.0.0-application-server.zip to target.] ***** included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/install/deploy_archive.yml for localhost TASK [redhat.jws.jws : Check that required parameters have been provided.] 
*****
ok: [localhost]

TASK [redhat.jws.jws : Check download archive path on target: /opt/jws-6.0.0-application-server.zip] ***
ok: [localhost]

TASK [redhat.jws.jws : Retrieve zipfiles, if missing, from RHN (if credentials provided)] ***
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/install/download_from_rhn.yml for localhost

TASK [redhat.jws.jws : Search for product to download using JBoss Network API] ***
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/rhn/search.yml for localhost

TASK [redhat.jws.jws : Ensure required parameters are provided] ****************
ok: [localhost]

TASK [redhat.jws.jws : Retrieve product download using JBossNetwork API] *******
ok: [localhost]

TASK [redhat.jws.jws : Ensure search results are valid.] ***********************
ok: [localhost]

TASK [redhat.jws.jws : Determine install zipfile from search results] **********
ok: [localhost]

TASK [redhat.jws.jws : Download Red Hat JWS] ***********************************
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/rhn/download.yml for localhost

TASK [redhat.jws.jws : Ensure required parameters are provided] ****************
ok: [localhost]

TASK [redhat.jws.jws : Load metadata on target location for download (/work/jws-6.0.0-application-server.zip)] ***
ok: [localhost]

TASK [redhat.jws.jws : Ensure /work/jws-6.0.0-application-server.zip is accessible] ***
ok: [localhost]

TASK [redhat.jws.jws : Download Red Hat product into {{ rhn_product_path }} (rhn_download_become: {{ rhn_download_become }})] ***
changed: [localhost]

TASK [redhat.jws.jws : Retrieve zipfiles from URL (if provided).] **************
skipping: [localhost]

TASK [redhat.jws.jws : Copy archives /work/jws-6.0.0-application-server.zip to target nodes: /opt/jws-6.0.0-application-server.zip] ***
changed: [localhost]

TASK [redhat.jws.jws : Deploy jws-6.0.0-optional-native-components-RHEL9-x86_64.zip to target.]
***
skipping: [localhost]

TASK [redhat.jws.jws : Include installation tasks for zip operations] **********
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/install/zipfiles.yml for localhost

TASK [redhat.jws.jws : Check arguments] ****************************************
ok: [localhost]

TASK [redhat.jws.jws : Add zipfile to unarchive list] **************************
ok: [localhost]

TASK [redhat.jws.jws : Add native zipfile to unarchive list] *******************
skipping: [localhost]

TASK [redhat.jws.jws : Install Jboss Web Server and required binaries from local zipfiles (install method: zipfiles)] ***
changed: [localhost] => (item={'src': 'jws-6.0.0-application-server.zip', 'creates': '/opt/jws-6.0/tomcat/bin'})

TASK [redhat.jws.jws : Move the zipfile extracted directory to custom jws_home] ***
skipping: [localhost]

TASK [redhat.jws.jws : Move the version.txt to custom jws_home] ****************
skipping: [localhost]

TASK [redhat.jws.jws : Include installation tasks for rpm method] **************
skipping: [localhost]

TASK [redhat.jws.jws : Include systemd tasks] **********************************
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/systemd/systemd.yml for localhost

TASK [redhat.jws.jws : Check arguments] ****************************************
ok: [localhost]

TASK [redhat.jws.jws : Ensure requirements for systemd] ************************
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/fastpackage.yml for localhost

TASK [redhat.jws.jws : Check if "systemd, procps-ng" packages are already installed] ***
ok: [localhost]

TASK [redhat.jws.jws : Add missing packages to the yum install list] ***********
ok: [localhost]

TASK [redhat.jws.jws : Install packages: ['java-17-openjdk-headless']] *********
ok: [localhost]

TASK [redhat.jws.jws : Set required default for jws_service_conf if not provided.]
***
ok: [localhost]

TASK [redhat.jws.jws : Set required default for jws_service_conf if not provided.] ***
ok: [localhost]

TASK [redhat.jws.jws : Set required default for jws_service_conf if not provided.] ***
ok: [localhost]

TASK [redhat.jws.jws : Ensure service script is deployed] **********************
changed: [localhost]

TASK [redhat.jws.jws : Ensure service configurations files is deployed: /opt/jws-6.0/tomcat/conf/jws6-tomcat.conf] ***
changed: [localhost]

TASK [redhat.jws.jws : Ensure systemd service is configured] *******************
changed: [localhost]

TASK [redhat.jws.jws : Include patch install tasks] ****************************
skipping: [localhost]

TASK [redhat.jws.jws : Ensure /opt/jws-6.0/tomcat/ directories have appropriate privileges] ***
ok: [localhost] => (item=conf)
ok: [localhost] => (item=temp)
ok: [localhost] => (item=logs)
ok: [localhost] => (item=webapps)
ok: [localhost] => (item=bin)

TASK [redhat.jws.jws : Ensure /opt/jws-6.0/tomcat/ files have the recommended priviliges, owner and group] ***
changed: [localhost] => (item=./conf/catalina.properties)
changed: [localhost] => (item=./conf/catalina.policy)
changed: [localhost] => (item=./conf/logging.properties)
changed: [localhost] => (item=./conf/jaspic-providers.xml)
changed: [localhost] => (item=conf/tomcat-users.xml)

TASK [redhat.jws.jws : Include ajp sanity check tasks] *************************
skipping: [localhost]

TASK [redhat.jws.jws : Include https sanity check tasks] ***********************
skipping: [localhost]

TASK [redhat.jws.jws : Deploy custom configuration files] **********************
changed: [localhost] => (item={'template': 'templates/6/server.xml.j2', 'dest': '/opt/jws-6.0/tomcat/./conf/server.xml'})
changed: [localhost] => (item={'template': 'templates/6/web.xml.j2', 'dest': '/opt/jws-6.0/tomcat/./conf/web.xml'})
changed: [localhost] => (item={'template': 'templates/6/context.xml.j2', 'dest': '/opt/jws-6.0/tomcat/./conf/context.xml'})
changed: [localhost] =>
(item={'template': 'templates/6/catalina.properties.j2', 'dest': '/opt/jws-6.0/tomcat/./conf/catalina.properties'})

TASK [redhat.jws.jws : Include selinux configuration tasks] ********************
skipping: [localhost]

TASK [redhat.jws.jws : Remove apps] ********************************************
ok: [localhost] => (item=examples)

TASK [redhat.jws.jws : Create vault configuration (if enabled)] ****************
skipping: [localhost]

TASK [redhat.jws.jws : Ensure firewalld, if enabled, allows communication over 8080.] ***
skipping: [localhost]

RUNNING HANDLER [redhat.jws.jws : Reload Systemd] ******************************
ok: [localhost]

RUNNING HANDLER [redhat.jws.jws : Ensure Jboss Web Server runs under systemd] ***
included: /root/.ansible/collections/ansible_collections/redhat/jws/roles/jws/tasks/systemd/service.yml for localhost

RUNNING HANDLER [redhat.jws.jws : Check arguments] *****************************
ok: [localhost]

RUNNING HANDLER [redhat.jws.jws : Enable jws service] **************************
changed: [localhost]

RUNNING HANDLER [redhat.jws.jws : Start jws service] ***************************
changed: [localhost]

RUNNING HANDLER [redhat.jws.jws : Restart Jboss Web Server service] ************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost : ok=66 changed=14 unreachable=0 failed=0 skipped=22 rescued=0 ignored=0

As you can see, quite a lot happened during this execution. Indeed, the redhat.jws.jws role took care of the entire setup of JWS on the target system.

Deploying a Web Application

Now that JBoss Web Server is running, we will go a bit further by deploying a web application and ensuring it is running. As we need to write our own playbook, the first step is to copy the playbook included within the collection and use it as a base:

Shell

$ cp ~/.ansible/collections/ansible_collections/redhat/jws/playbooks/playbook.yml .
$ cat playbook.yml
---
- name: "Red Hat JBoss Web Server installation and configuration"
  hosts: all
  become: True
  vars_files:
    - vars.yml
  roles:
    - redhat.jws.jws
$ cp ~/.ansible/collections/ansible_collections/redhat/jws/playbooks/vars.yml .
$ cat vars.yml
---
jws_setup: true
jws_java_version: 17
jws_listen_http_bind_address: 127.0.0.1
jws_systemd_enabled: True
jws_service_systemd_type: forking
jws_selinux_enabled: False

Now, we’ll add a tasks section to the playbook. This section runs after the roles have executed successfully, so we know that JWS will be operational on the targets at that point. We will include the following task to deploy a web application:

YAML

---
- name: "Red Hat JBoss Web Server installation and configuration"
  hosts: all
  become: True
  vars_files:
    - vars.yml
  roles:
    - redhat.jws.jws
  tasks:
    - name: "Deploy webapp"
      ansible.builtin.get_url:
        url: "https://drive.google.com/uc?export=download&id=1w9ss5okctnjUvRAxhPEPyC7DmbUwmbhb"
        dest: "{{ jws_home }}/webapps/info.war"

Let’s run the playbook again:

Shell

$ ansible-playbook -i inventory -e @service_account.yml playbook.yml

PLAY [Red Hat JBoss Web Server installation and configuration] *****************

TASK [Gathering Facts] *********************************************************
ok: [localhost]
…
TASK [Deploy webapp] ***********************************************************
changed: [localhost]

RUNNING HANDLER [redhat.jws.jws : Restart Jboss Web Server service] ************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost : ok=3 changed=2 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0

To be thorough, we will add a
post_tasks: section to verify that the web application has been successfully deployed and that the associated service is now available:

YAML

  post_tasks:
    - name: "Check that /info is accessible"
      ansible.builtin.uri:
        url: "http://localhost:8080/info"
        status_code: 200
        return_content: no

The Benefits of Such Automation

In short, automation saves time and reduces the risk of error inherent in any human manipulation. The Red Hat Ansible Certified Content Collection encapsulates (as much as possible) the complexities and inner workings of Red Hat JBoss Web Server deployment. With the help of the certified Ansible collection, you can focus on your business use case, such as deploying applications, instead of setting up the underlying application server. The result is reduced complexity and faster time to value. The automated process is also repeatable and can be used to set up as many systems as needed.
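The uri task above is, at heart, an HTTP smoke test: request the URL and require a 200. A minimal Python sketch of the same check, using only the standard library; the in-process HTTP server here is a throwaway stand-in for the deployed Tomcat instance, and the /info path mirrors the example above:

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

class InfoHandler(BaseHTTPRequestHandler):
    """Stand-in for the deployed /info web application."""
    def do_GET(self):
        code = 200 if self.path == "/info" else 404
        self.send_response(code)
        self.end_headers()
        self.wfile.write(b"up" if code == 200 else b"")
    def log_message(self, *args):
        pass  # keep output quiet

def smoke_test(url):
    """Return the HTTP status code, like the uri task's status_code check."""
    with urlopen(url) as resp:
        return resp.status

server = HTTPServer(("127.0.0.1", 0), InfoHandler)  # port 0: pick a free port
threading.Thread(target=server.serve_forever, daemon=True).start()

port = server.server_address[1]
status = smoke_test(f"http://127.0.0.1:{port}/info")
print(status)  # 200
server.shutdown()
```

The same idea scales to any post-deployment check: hit a known endpoint, fail the run if the status code is unexpected.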
Implementing Continuous Integration/Continuous Deployment (CI/CD) for a Python application using Django involves several steps to automate testing and deployment processes. This guide will walk you through setting up a basic CI/CD pipeline using GitHub Actions, a popular CI/CD tool that integrates seamlessly with GitHub repositories.

Step 1: Setting up Your Django Project

Ensure your Django project is in a Git repository hosted on GitHub. This repository will be the basis for setting up your CI/CD pipeline.

Step 2: Creating a Virtual Environment and Dependencies File

1. Virtual Environment

It's good practice to use a virtual environment for your Django project to manage dependencies.

Shell

python3 -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`

2. Dependencies File

Create a `requirements.txt` file in your project root that lists all your project's dependencies, including Django itself.

Shell

pip freeze > requirements.txt

Step 3: Configuring GitHub Actions for Continuous Integration

1. Create a Workflow File

In your repository, create a directory and file for your GitHub Actions workflow: `.github/workflows/django-ci.yml`.

2. Define the Workflow

Populate `django-ci.yml` with the steps needed to install dependencies, run tests, and any other checks you want as part of your CI process. Here's a breakdown of each section of a simple `django-ci.yml` file:

YAML

name: Django CI

on: [push, pull_request]

name: This is a descriptive name for your workflow. It will appear in the GitHub Actions section of your repository.

on: This key specifies the events that will trigger the workflow. In this case, the workflow runs on `push` events to any branch and on `pull_request` events.

YAML

jobs:
  build:
    runs-on: ubuntu-latest

jobs: Jobs are a set of steps that execute on the same runner. Here, we have a job named `build`.

build: This is the identifier for the job.
You can name it anything, but `build` is descriptive of its purpose.

runs-on: Specifies the type of machine to run the job on. `ubuntu-latest` means the job will run on the latest version of Ubuntu Linux.

YAML

steps:
- uses: actions/checkout@v2
- name: Set up Python 3.x
  uses: actions/setup-python@v2
  with:
    python-version: 3.x

steps: Steps are individual tasks that run commands in the job. Each step can either run a script or an action.

uses: actions/checkout@v2: This step uses the `checkout` action to check out your repository under `$GITHUB_WORKSPACE`, so your workflow can access it.

uses: actions/setup-python@v2: This action sets up a Python environment for the job, specifying that we want to use Python 3.x.

YAML

- name: Install Dependencies
  run: |
    python -m pip install --upgrade pip
    pip install -r requirements.txt

name: Provides a descriptive name for the step.

run: Executes command-line programs. Here, it's used to upgrade `pip` and install the dependencies listed in `requirements.txt`.

YAML

- name: Run Django Tests
  run: |
    python manage.py test

Run Django Tests: This step runs the Django test suite by executing `manage.py test`. This is where your Django application's unit tests are executed, ensuring that your codebase works as expected.

Step 4: Configuring Continuous Deployment

Continuous Deployment can be configured to automatically deploy your Django application to a hosting service like Heroku, AWS, or any other provider you choose. For this example, we'll use Heroku.

Preparing Your Django Project for Heroku Deployment

Before integrating the deployment process into your CI/CD pipeline, ensure your Django project is prepared for Heroku deployment. This preparation includes:

1. Procfile

Create a `Procfile` in the root directory of your Django project. This file tells Heroku how to run your application.
For a typical Django app, the `Procfile` might look like this:

Plain Text

web: gunicorn myproject.wsgi --log-file -

Replace `myproject` with the name of your Django project.

2. Runtime Specification

If your application requires a specific Python version, specify this version in a `runtime.txt` file in your project's root directory, like so:

Plain Text

python-3.9.1

Note: any supported `python-3.x` version can be specified here.

3. Database Configuration

Make sure your `settings.py` is configured to use Heroku's database when deployed. Heroku sets an environment variable called `DATABASE_URL` for the database, which you can use with `dj-database-url`:

Python

import dj_database_url

DATABASES['default'] = dj_database_url.config(conn_max_age=600, ssl_require=True)

Extending the GitHub Actions Workflow for Deployment

After preparing your Django project for Heroku deployment, extend your `.github/workflows/django-ci.yml` file to include deployment steps. Here's how you can do it:

YAML

- name: Install Heroku CLI
  run: |
    curl https://cli-assets.heroku.com/install.sh | sh

- name: Deploy to Heroku
  if: github.ref == 'refs/heads/main'
  run: |
    heroku git:remote -a ${{ secrets.HEROKU_APP_NAME }}
    git push heroku HEAD:master -f
  env:
    HEROKU_API_KEY: ${{ secrets.HEROKU_API_KEY }}

Breakdown

1. Install Heroku CLI

This step installs the Heroku Command Line Interface on the runner, which allows you to interact with Heroku from the command line.

2. Deploy to Heroku

Conditional Execution: The `if: github.ref == 'refs/heads/main'` condition ensures that the deployment only occurs when changes are pushed to the `main` branch.

Heroku Remote Setup: `heroku git:remote -a ${{ secrets.HEROKU_APP_NAME }}` sets the Heroku app as a remote for git. Replace `HEROKU_APP_NAME` with the name of your Heroku app.

Deployment: `git push heroku HEAD:master -f` pushes the code to Heroku, triggering a deployment. The `-f` flag forces the push, which might be necessary if you're overwriting history.
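For intuition about the `settings.py` snippet above: `dj_database_url.config` essentially parses the `DATABASE_URL` environment variable into the dictionary shape Django's `DATABASES` setting expects. A rough, simplified sketch of that parsing with only the standard library (Postgres only; the real package handles many more engines and options, and the URL below is a made-up example):

```python
import os
from urllib.parse import urlparse

def parse_database_url(url):
    """Simplified stand-in for dj_database_url.parse (Postgres only)."""
    parts = urlparse(url)
    return {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": parts.path.lstrip("/"),   # path is "/dbname"
        "USER": parts.username,
        "PASSWORD": parts.password,
        "HOST": parts.hostname,
        "PORT": parts.port or 5432,       # default Postgres port
    }

# Illustrative value; Heroku injects the real one at runtime.
os.environ["DATABASE_URL"] = "postgres://myuser:s3cret@db.example.com:6432/mydb"
config = parse_database_url(os.environ["DATABASE_URL"])
print(config["NAME"], config["HOST"], config["PORT"])  # mydb db.example.com 6432
```

In production, prefer the real `dj-database-url` package: it also handles SSL requirements and connection pooling options like `conn_max_age`.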
Step 5: Managing Secrets for Heroku Deployment

For the deployment steps to work, you need to configure the `HEROKU_API_KEY` and `HEROKU_APP_NAME` as secrets in your GitHub repository:

Go to your repository on GitHub, click on "Settings" > "Secrets".

Click on "New repository secret."

Add `HEROKU_API_KEY` as the name and your Heroku API key as the value.

Repeat the process for `HEROKU_APP_NAME`, adding your Heroku app's name as the value.

Conclusion

This guide outlines the basic steps to set up a CI/CD pipeline for a Django application using GitHub Actions, from automated testing to deployment on Heroku. The CI process ensures your code is automatically tested, while the CD process enables seamless deployment to your hosting platform, ensuring your application is always up-to-date with the latest changes in your codebase.

Cheers, Happy Coding!!
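As a footnote to the CI setup above: the "Run Django Tests" step ultimately executes tests built on Python's unittest framework (a real Django test would subclass django.test.TestCase, which extends unittest.TestCase). A framework-free sketch of the kind of test that step discovers and runs; the slugify helper is hypothetical, invented only for illustration:

```python
import unittest

def slugify(title):
    """Hypothetical helper a Django view might use: lowercase, spaces to hyphens."""
    return "-".join(title.lower().split())

class SlugifyTests(unittest.TestCase):
    def test_basic(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_collapses_whitespace(self):
        self.assertEqual(slugify("A  B"), "a-b")

# Run programmatically (manage.py test does the discovery for you in Django).
result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTests)
)
```

In CI, a non-zero exit code from the test runner is what fails the workflow step and blocks the deployment job.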
Understanding how to organize a pipeline from development to operation has, in my experience, proven to be quite the endeavor. This tutorial aims to tackle precisely this challenge by guiding you through the tools needed to deploy your code as Docker containers, walking through the steps involved in creating a simple "Hello, World!" application (preexisting projects are also easily adapted to this approach). Whether you're a seasoned developer seeking to optimize your workflow or a newcomer eager to learn best practices, this tutorial will equip you with the knowledge and tools to streamline your development process effectively. Moreover, becoming proficient in this pipeline setup will greatly enhance your workflow, allowing you to deliver high-quality software faster, with fewer errors, and ultimately better meet the demands of today's agile development environments.

If you have come far enough to consider a pipeline for your project, I expect you to be familiar with some of the simpler tools involved in this process (e.g., Git, Java, Maven), and will not cover these in depth.

You may also enjoy: Building CI/CD Pipelines for Java Using Azure DevOps (Formerly VSTS)

To build a pipeline for our "Hello, World!" application, the following subjects will briefly be covered:

Azure DevOps
Azure Repos
Maven
Git
Azure Pipelines
Docker

To make things clear: our goal is to be able to run docker run <dockerid>/<image>:<tag> after having done nothing more than git push on master. This is an attempt to create a foundation for future CI/CD implementations, ultimately leading to a DevOps environment.

Azure DevOps

One of the prerequisites for this walk-through is to use the Azure DevOps platform. I can highly recommend the full package, but the modules Repos and Pipelines are the only ones required. So, if you have not already, you should sign yourself up and create a project. After doing so, we can proceed to the Repos module.
Azure Repos

This module provides some simple tools for maintaining a repository for your code. While a repository could easily be managed by something like GitHub, this module supports solid synergy between repositories and pipelines. After you click on the module, you will be met with the usual Git preface for setting up a repository. I highly recommend using the SSH methods for long-term usage (if this is unknown to you, see Connect to your Git repos with SSH). Now, after setting it up, you will be able to clone the repository onto your computer.

Continuing, we will create a Maven project within the repository folder using IntelliJ IDEA (other IDEs can be used, but I will only cover IntelliJ) that ultimately prints the famous sentence, "Hello, World!" (for setting up a project with Maven, see Creating a new Maven project - IntelliJ). This should leave you with a project tree like so:

Hello World project tree

Finish off by creating a main class in src/main/java:

Java

public class Main {
    public static void main(String[] args) {
        System.out.println("Hello World!");
    }
}

But before pushing these changes to master, a few things need to be addressed.

Maven

Maven provides developers with a powerful software management tool configurable from one location, the pom.xml file. Looking at the generated pom file in our project, we will see the following:

XML

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>surgo.testing</groupId>
    <artifactId>testing-helloworld</artifactId>
    <version>1.0</version>
</project>

In our case, the only really interesting part of the pom file is the version tag.
The reason is that upon pushing our source code to master, Maven will require a new version each time, enforcing good practice.

As an extension, we need to make Maven create an executable .jar file with a manifest stating where the main class is located. Luckily, we can just use Maven's own plugin:

XML

<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>surgo.testing</groupId>
    <artifactId>testing-helloworld</artifactId>
    <version>1.0</version>

    <properties>
        <main.class>Main</main.class>
    </properties>

    <build>
        <plugins>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-jar-plugin</artifactId>
                <version>3.1.2</version>
                <configuration>
                    <archive>
                        <manifest>
                            <addClasspath>true</addClasspath>
                            <classpathPrefix>lib/</classpathPrefix>
                            <mainClass>${main.class}</mainClass>
                        </manifest>
                    </archive>
                </configuration>
            </plugin>
        </plugins>
    </build>
</project>

The only thing you might want to change is the name of the main class in the `main.class` property. Remember to include the package name if the class is not located directly in src/main/java (I prefer using a property, but you can insert the name directly in the `<mainClass>` tag if you like).

Lastly, before committing our additions to master, we need to build the target folder which includes our .jar file. This can be done either directly through IntelliJ or in the terminal (if you have Maven installed). Simply press the lifecycle "package" in the UI, or run mvn package in the terminal. Upon finalization, a .jar file will have appeared in the target folder.

This concludes the initial setup necessary for our pipeline and we can now finally push our changes to master.
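What the archive/manifest configuration above actually produces is a META-INF/MANIFEST.MF entry inside the jar carrying the Main-Class attribute that `java -jar` reads to find main(). Since a jar is just a zip file, a small Python sketch can build and inspect a toy jar with the standard zipfile module; the file contents here are illustrative stand-ins, not a real compiled class:

```python
import io
import zipfile

# Build a toy "jar" the way maven-jar-plugin would: a zip archive containing
# META-INF/MANIFEST.MF with the Main-Class attribute from <mainClass>.
manifest = (
    "Manifest-Version: 1.0\n"
    "Class-Path: lib/\n"
    "Main-Class: Main\n"
)
buf = io.BytesIO()  # in-memory stand-in for target/testing-helloworld-1.0.jar
with zipfile.ZipFile(buf, "w") as jar:
    jar.writestr("META-INF/MANIFEST.MF", manifest)
    jar.writestr("Main.class", b"\xca\xfe\xba\xbe")  # placeholder bytes

# Read the Main-Class back, which is what `java -jar` uses to find main().
with zipfile.ZipFile(buf) as jar:
    text = jar.read("META-INF/MANIFEST.MF").decode()
main_class = next(
    line.split(": ", 1)[1]
    for line in text.splitlines()
    if line.startswith("Main-Class")
)
print(main_class)  # Main
```

The same inspection works on the real artifact: `unzip -p target/testing-helloworld-1.0.jar META-INF/MANIFEST.MF` shows the manifest the plugin generated.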
Git

Most of you are probably quite familiar with Git, but I will go ahead and cover what needs to be done anyway. The Git tool provides us with a distributed version control system easily accessible from anywhere. Provided we correctly configured our repository in Azure Repos, cloned it to our local computer, and initialized the IntelliJ project within that folder, the following should be straightforward.

As all of our added files have yet to be staged, run git add . to stage every changed or added file. Then run git commit -m "initial commit" to commit the staged files. Lastly, run git push to push the committed files to master.

You might now be wondering, "Has all the magic happened?" And the answer would be no. In fact, not much has happened. We have created a repository and filled it with a Maven project that prints "Hello, World!" when invoked, which in all honesty, is not much of an achievement. But, more importantly, we have established a foundation for our pipeline.

Azure Pipelines

Pipelines, the star of the show, provides us with build and deployment automation. It enables us to customize what should happen whenever a build is triggered (in our case, by pushing to master). Let me take you through the process of setting up a simple pipeline.

Step 1: First, go to the Azure DevOps Pipelines module. This will present you with a single button, "Create Pipeline"; press it.

Step 2: We will now be prompted for the location of our code, and since we used Azure Repos, press "Azure Repos Git."

Step 3: It will now look through your repositories. Press the one you pushed the Maven project onto.

Step 4: Since it is a Maven project, select "Maven." You should now be presented with the following azure.pipelines.yml file:

YAML

# Maven
# Build your Java project and run tests with Apache Maven.
# Add steps that analyze code, save build artifacts, deploy, and more:
# https://docs.microsoft.com/azure/devops/pipelines/languages/java

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

steps:
- task: Maven@3
  inputs:
    mavenPomFile: 'pom.xml'
    mavenOptions: '-Xmx3072m'
    javaHomeOption: 'JDKVersion'
    jdkVersionOption: '1.8'
    jdkArchitectureOption: 'x64'
    publishJUnitResults: true
    testResultsFiles: '**/surefire-reports/TEST-*.xml'
    goals: 'package'

Do not think too much about the semantics of the file. The important thing to know for now is that the trigger is set to master and the steps include a task for Maven. For more information about the Maven inputs, see Maven task.

Step 5: If everything looks in order, press "Save and run" in the top-right corner to add the azure.pipelines.yml file to the repository. The pipeline will then be activated and run its first job.

Docker

Docker, the final piece of the puzzle, provides us with OS-level virtualization in the shape of containers, with lots of versatility and opportunity. We need this tool to deploy our builds on machines and, luckily, it is greatly integrated into the Azure DevOps platform. To fully utilize its many capabilities, you will need to register on Docker Hub.

Step 1: After registration, create a repository with the name of your application. Then choose whether or not to make it public (you can only have one private repository with the free plan).

Step 2: Next, we need to authorize Docker Hub in our Azure DevOps project. To do this, go back to Azure DevOps and click on "Project Settings" in the bottom-left corner.

Step 3: Choose "Pipelines/Service Connections".

Step 4: Now click on the top-right button "New service connection" and search for Docker registry. Mark it, and hit next.

Step 5: Choose "Docker Hub" as the registry type.

Step 6: Fill in the remaining fields (the service connection name is up to you).
You should now be able to see your entry below "Service Connections." The connection will make itself relevant later, but for now, we need to go back to the project and add a few things.

Since we added the azure.pipelines.yml file to the repository, a git pull needs to be called to pull the newest changes. Furthermore, we need to define our Docker image using a Dockerfile.

Step 7: Create a new file in the root of the project and name it "Dockerfile." Your project tree should now look something like this:

Project Tree with addition of Dockerfile

The Dockerfile should be considered a template for containers, much like classes are for objects. What needs to be defined in this template is as follows:

We need to set a basis for the virtual environment (FROM openjdk:8).

We need to copy our .jar file onto the virtual environment (COPY /target/testing-helloworld-?.?*.jar .).

We need to run the .jar file upon initialization (CMD java -jar testing-helloworld-?.?*.jar).

You should now have a file looking similar to this:

Dockerfile

FROM openjdk:8
COPY /target/testing-helloworld-?.?*.jar .
CMD java -jar testing-helloworld-?.?*.jar

The wildcard pattern simply accounts for different versions being deployed, but the actual name has to match the .jar file from the target folder.

Update the azure.pipelines.yml File

To sum up our current progress, we have now made a Maven project, linked it to a pipeline, and created a template for the virtual environment. The only thing missing is to connect everything via the azure.pipelines.yml file.

Step 1: Add Variables

We will need to add some variables for the Docker Hub connection, as well as the ever-changing version number, to the azure.pipelines.yml file (insert your own service connection and Docker repository):

YAML

...
variables:
  containerRegistryServiceConnection: saban17-testing
  imageRepository: saban17/testing-helloworld
  tag: 1.0.0
...
These variables are not strictly necessary, but it never hurts to follow the DRY principle.

Step 2: Add Tasks to Pipeline Steps

Next, we need to add more tasks to our pipeline steps. What needs to happen is to log in to Docker, build the Dockerfile previously defined, and push the image to our Docker Hub repository. One at a time, we add the wanted behavior, starting with the Docker login:

YAML

- task: Docker@2
  displayName: dockerLogin
  inputs:
    command: login
    containerRegistry: $(containerRegistryServiceConnection)

Then the Docker build:

YAML

- task: Docker@2
  displayName: dockerBuild
  inputs:
    repository: $(imageRepository)
    command: build
    Dockerfile: Dockerfile
    tags: |
      $(tag)

And lastly, the Docker push:

YAML

- task: Docker@2
  displayName: dockerPush
  inputs:
    command: push
    containerRegistry: $(containerRegistryServiceConnection)
    repository: $(imageRepository)
    tags: |
      $(tag)

You should now have an azure.pipelines.yml file looking similar to this (with the addition of mavenAuthenticateFeed: true in the Maven@3 inputs):

YAML

trigger:
- master

pool:
  vmImage: 'ubuntu-latest'

variables:
  containerRegistryServiceConnection: saban17-testing
  imageRepository: saban17/testing-helloworld
  tag: 1.0.0

steps:
- task: Maven@3
  inputs:
    mavenPomFile: 'pom.xml'
    mavenOptions: '-Xmx3072m'
    javaHomeOption: 'JDKVersion'
    jdkVersionOption: '1.8'
    jdkArchitectureOption: 'x64'
    publishJUnitResults: true
    mavenAuthenticateFeed: true
    testResultsFiles: '**/surefire-reports/TEST-*.xml'
    goals: 'package'

- task: Docker@2
  displayName: dockerLogin
  inputs:
    command: login
    containerRegistry: $(containerRegistryServiceConnection)

- task: Docker@2
  displayName: dockerBuild
  inputs:
    repository: $(imageRepository)
    command: build
    Dockerfile: Dockerfile
    tags: |
      $(tag)

- task: Docker@2
  displayName: dockerPush
  inputs:
    command: push
    containerRegistry: $(containerRegistryServiceConnection)
    repository: $(imageRepository)
    tags: |
      $(tag)

Understandably, this might be a little overwhelming; but fear not, it looks more complicated than it really is. For more information about these inputs, see Docker task.

Push to the Pipeline

Finally, we get to see the magic happen. Before doing so, however, I need to tell you the routine procedure for pushing to the pipeline:

Step 1: Go into the pom.xml and azure.pipelines.yml files, and increment the version number.

Step 2: Run the Maven lifecycle clean to remove earlier .jar files in the target folder.

Step 3: Run the Maven lifecycle package to build and package your code (creating the new .jar file).

Step 4: Provided you are on the master branch, run the Git commands:

git add .
git commit -m "commit message"
git push

Step 5: Check whether or not the job passes in the pipeline.

If everything went as it should, you have now uploaded an image with your .jar file to the associated Docker Hub repository. Running this image now only requires the host to have Docker installed. Let us try it!

Running Docker Hub repository

The input (a) initiates a container from the requested repository. The image was then retrieved, instantiated, and processed, with the final result (b) displaying "Hello World!"

This concludes the guide for setting up your Java pipeline with Azure DevOps and Docker.

Conclusion

By now, it should hopefully be clear why this approach has its benefits. It enables the developer to form a run-time environment (Dockerfile) and ship it to operations with little to no effort (git push). While it has not been covered here, this approach also creates artifacts in Azure DevOps, which is very useful when using something like Maven, as it makes dependencies surprisingly easy to manage. Since this approach only recently made it into our team, it is still under development and a lot of additions are still to be made.
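Step 1 of the push routine, bumping the version in both pom.xml and azure.pipelines.yml, is easy to forget and easy to get out of sync. A small hypothetical helper script can illustrate automating it; the file contents, regexes, and patch-bump scheme here are assumptions for illustration, not part of the original setup:

```python
import re

def bump_patch(version):
    """Increment the last numeric component of a dotted version string."""
    parts = version.split(".")
    parts[-1] = str(int(parts[-1]) + 1)
    return ".".join(parts)

def bump_pom(pom_xml):
    """Bump the first <version> tag in a pom.xml-like string.

    count=1 targets the project's own version, not plugin versions
    further down the file.
    """
    return re.sub(
        r"<version>([\d.]+)</version>",
        lambda m: "<version>{}</version>".format(bump_patch(m.group(1))),
        pom_xml,
        count=1,
    )

def bump_pipeline(yaml_text):
    """Bump the `tag:` variable in an azure.pipelines.yml-like string."""
    return re.sub(
        r"tag: ([\d.]+)",
        lambda m: "tag: {}".format(bump_patch(m.group(1))),
        yaml_text,
        count=1,
    )

pom = "<project><version>1.0.0</version></project>"
pipeline = "variables:\n  tag: 1.0.0\n"
print(bump_pom(pom))       # <project><version>1.0.1</version></project>
print(bump_pipeline(pipeline))
```

A real version of this would read and rewrite the two files in place before the git add/commit/push sequence, keeping the Maven artifact version and the Docker image tag in lockstep.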
I highly encourage you to further expand upon your pipeline by making it fit your exact needs. I hope this guide has proven to be useful as well as practical, and should you have any further questions, feel free to comment.
John Vester, Staff Engineer, Marqeta
Raghava Dittakavi, Manager, Release Engineering & DevOps, TraceLink