Because the live platform in the client's own data centers had grown increasingly complex, development processes and test environments drifted ever further from production reality. The exact effect of software and infrastructure changes was very difficult to estimate, and it was harder still to test under simulated real-world conditions so that uninterrupted operation could be guaranteed under all circumstances.
The goal was to provide an internal, highly flexible cloud platform for the development departments, on which a reliable CI/CD pipeline could be modeled and future infrastructure changes tested on demand. Developers and DevOps engineers were also to be able to provision the resources they need through a self-service portal.
Furthermore, the underlying storage system was to be S3-compatible, block-storage-ready, and freely scalable, so that it could provide resources for future projects and, if necessary, serve as a company-wide backup system.
We chose a four-node ESXi cluster as the hypervisor layer; it draws its virtual machine block storage from a Ceph cluster connected via iSCSI multipath. To remain operational if the storage system has problems, the local disks of the hypervisor hosts were pooled with EMC ScaleIO into a second, horizontally scalable storage tier that holds the system-critical cluster and storage management machines.
To provide space-efficient, fast S3-compatible storage, erasure-coded pools were configured in the Ceph cluster.
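The space saving of erasure coding over replication can be illustrated with a short calculation. The sketch below is only an illustration: the 4+2 profile, the three-copy comparison, and the 120 TB raw capacity are assumed values, not figures from this project. In a pool with k data chunks and m coding chunks, k/(k+m) of the raw capacity holds user data, whereas a replicated pool with n copies keeps 1/n.

```python
from fractions import Fraction


def usable_fraction_erasure(k: int, m: int) -> Fraction:
    """Share of raw capacity that stores user data in a k+m erasure-coded pool."""
    if k < 1 or m < 1:
        raise ValueError("k and m must be positive integers")
    return Fraction(k, k + m)


def usable_fraction_replicated(copies: int) -> Fraction:
    """Share of raw capacity that stores user data when every object is kept `copies` times."""
    if copies < 1:
        raise ValueError("copies must be a positive integer")
    return Fraction(1, copies)


# Hypothetical values for illustration only (not taken from the project).
raw_tb = 120
ec = usable_fraction_erasure(k=4, m=2)       # 4 data chunks + 2 coding chunks
rep = usable_fraction_replicated(copies=3)   # three full copies

print(f"EC 4+2:         {float(raw_tb * ec):.0f} TB usable of {raw_tb} TB raw")
print(f"3x replication: {float(raw_tb * rep):.0f} TB usable of {raw_tb} TB raw")
```

Under these assumptions both layouts survive the loss of two chunks or copies of an object, but the erasure-coded pool offers twice the usable capacity from the same raw disks, at the cost of more CPU and network work per read and write.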
For development, a Jenkins-based portal was provided through which individually preconfigured machines as well as CoreOS and Kubernetes clusters can be provisioned on demand.
ESXi, Ceph, ScaleIO, SLES, CoreOS, Kubernetes, Docker, Jenkins, Ansible
The internal cloud platform has established itself as a reliable and practical development platform that simplifies day-to-day processes, supports the evaluation of new technologies, and serves for prototyping. It can also be used to test more complex migration scenarios.
The merger of two medium-sized companies in the industrial services sector increased the complexity of the IT and process landscape. Technological differences kept the merged company from achieving an optimal time to market, and the changes in organizational structure and the decentralization of resources did not have the desired effect.
The goal was to consolidate and simplify the IT landscape and to optimize the entire IT value chain so that development and IT service operation run as efficiently and error-free as possible. The newly developed IT approach was to provide a common basis for all future projects.
- Infrastructure: VMware, Windows, Red Hat, Postgres, Hadoop, MS SQL, Puppet, Elastic Stack, Docker
- Languages: Go, Java, .NET
- Monitoring: SolarWinds, PRTG
- CI/CD: Jenkins, GitLab
- Collaboration: Jira, Confluence
Standards and Frameworks
Size of project
- > 50 project members: data center operations, development, IT service, application management
- > 500 virtual machines
- > 100 developers
- DevOps / cultural change management
- System architecture
- Service architecture
- Application architecture
- Project management
- Design, setup and operation of the virtual infrastructure, coordination with data center operation
- Development of a microservices application architecture
- Setup of release management and continuous integration / deployment
- Construction of logging and monitoring platforms
- Introduction of incident management, SLA / KPIs
- ISO 27001 certification