Cloud Architecture Notes

Practical notes on Oracle Cloud Infrastructure, Kubernetes, serverless systems, automation, and Generative AI. I write from hands-on experiments, benchmarks, and field patterns.

OCI GenAI Python Starters: a few small demos to get started

Every time I want to try something new on OCI Generative AI, I end up needing the same thing: one tiny Python example that actually works. Not a full framework. Not a polished app. Just a small script with the right config, the right endpoint, and one clear path from “I have credentials” to “okay, it runs.” After rebuilding those snippets a few times, I put them in one repo: OCI GenAI Python Starters. ...

April 27, 2026 · 2 min · Enrico Pesce
OCI GenAI Catalog model comparison

OCI GenAI Catalog: Pick the Right Model

OCI Generative AI has grown fast: models from Cohere, Google, Meta, OpenAI, and xAI are all available, each with multiple variants. Every time I started a new project I had to dig through documentation to find the right model. So I built OCI GenAI Catalog: a reference guide covering 30+ models with a guided selection wizard.

What’s inside

- 24 chat models from 5 providers, with specs: context window, multimodal, tool use, reasoning, fine-tuning support
- 9 embedding models and 1 reranking model for RAG pipelines
- A model selection wizard: filter by task, performance tier, and context needs to get a top-3 recommendation

Provider summary

| Provider | Models | Strength |
|---|---|---|
| Cohere | 5 | RAG, fine-tuning |
| Google Gemini | 3 | Multimodal, long context (up to 1M tokens) |
| Meta Llama | 5 | Open weights, MoE efficiency |
| OpenAI gpt-oss | 2 | Reasoning, agents |
| xAI Grok | 6 | 2M context, code specialization |

Data is sourced from OCI official docs and kept up to date. Check it out at OCI GenAI Catalog.
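To make the wizard idea concrete, here is a minimal Python sketch of filtering by task, performance tier, and context needs and returning a top-3 list. It is an illustration only, not the catalog's actual logic: the `ModelEntry` type, the tier names, the ranking rule, and every entry in `CATALOG` are hypothetical placeholders rather than real model data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelEntry:
    name: str
    tasks: frozenset      # tasks the model is suited for
    tier: str             # hypothetical tiers: "fast", "balanced", "premium"
    context_tokens: int   # maximum context window


# Placeholder entries for illustration only; these are not real catalog data.
CATALOG = [
    ModelEntry("example-small", frozenset({"chat"}), "fast", 128_000),
    ModelEntry("example-medium", frozenset({"chat", "rag"}), "balanced", 256_000),
    ModelEntry("example-large", frozenset({"chat", "rag", "agents"}), "premium", 1_000_000),
    ModelEntry("example-long", frozenset({"chat", "code"}), "balanced", 2_000_000),
    ModelEntry("example-coder", frozenset({"code"}), "fast", 256_000),
]


def recommend(catalog, task, tier, min_context, limit=3):
    """Return up to `limit` models that support `task` and fit `min_context`."""
    candidates = [
        m for m in catalog
        if task in m.tasks and m.context_tokens >= min_context
    ]
    # Assumed ranking: exact tier matches first, then the smallest
    # context window that still satisfies the requirement.
    candidates.sort(key=lambda m: (m.tier != tier, m.context_tokens))
    return candidates[:limit]
```

For example, `recommend(CATALOG, "chat", "balanced", 200_000)` keeps only chat-capable entries with at least 200k tokens of context and lists the two "balanced" ones ahead of the "premium" one.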

March 9, 2026 · 1 min · Enrico Pesce
Ampere CPU inference benchmark

Generative AI: Efficient Inference on Cloud CPUs

It’s been a while since I last wrote here. Lately, I’ve been diving deep into AI inference—the process of running AI models to generate responses—specifically exploring whether we truly need expensive GPUs for running modern language models. Spoiler alert: the answer might surprise you. After extensive testing on Oracle Cloud Infrastructure (OCI), comparing ARM-based Ampere processors against the latest AMD EPYC chips, I discovered that the right combination of software optimizations and compressed models can deliver remarkable performance—all without a single GPU. ...

February 4, 2026 · 5 min · Enrico Pesce
Deploy Oracle Kubernetes Engine Clusters in Minutes

In today’s cloud-native landscape, Kubernetes has become the de facto standard for container orchestration. However, setting up a production-ready Kubernetes cluster can still be a complex and time-consuming process, especially for those new to the ecosystem. Enter OKED (Oracle Kubernetes Engine Deploy), an elegant solution that streamlines the deployment of Kubernetes clusters on Oracle Cloud Infrastructure.

What is OKED?

OKED is an automated tool that simplifies the deployment of Oracle Kubernetes Engine (OKE) clusters. Built on the Pulumi framework, it enables users to have a fully functional Kubernetes environment up and running in minutes, without requiring extensive expertise in either OCI or Kubernetes. ...

April 25, 2025 · 3 min · Enrico Pesce
Building a Modern Translation Service with Oracle Cloud Infrastructure's Generative AI

The Challenge with Modern Translation

Traditional translation services often struggle with context, idioms, and the subtle nuances that make language beautiful and meaningful. As businesses become increasingly global, there’s a growing need for translation services that can handle these complexities while maintaining security, scalability, and cost-effectiveness.

Enter OCI Generative AI

Oracle Cloud Infrastructure’s Generative AI service offers a compelling solution to these challenges. Unlike conventional translation APIs, OCI’s service leverages advanced language models that understand context and cultural nuances, making it an ideal choice for enterprise applications. ...

December 18, 2024 · 3 min · Enrico Pesce
Stop OCI VM instance

Introduction

When running workloads on Oracle Cloud Infrastructure (OCI), managing your instances efficiently is crucial. This guide will show you how to create a simple, efficient way to shut down your OCI instance from within the instance itself, using instance metadata and instance principal authentication.

Prerequisites

- An OCI instance with instance principal authentication enabled
- OCI CLI installed on your instance
- Basic understanding of command-line operations

Understanding the Components

Instance Metadata

Every OCI instance has access to its own metadata through a local endpoint. This metadata includes crucial information like: ...
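The approach described in this excerpt can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the script from the full post: it assumes the v2 metadata endpoint (which requires the `Authorization: Bearer Oracle` header), reads the instance OCID from the `id` field, and shells out to the OCI CLI with instance principal authentication. The function names and the choice of `SOFTSTOP` as the default action are illustrative.

```python
import json
import subprocess
import urllib.request

# Link-local metadata endpoint, reachable only from inside the instance.
METADATA_URL = "http://169.254.169.254/opc/v2/instance/"


def fetch_instance_metadata(url=METADATA_URL, timeout=5):
    """Read this instance's metadata as a dict (v2 needs the Bearer Oracle header)."""
    request = urllib.request.Request(url, headers={"Authorization": "Bearer Oracle"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def build_stop_command(instance_id, action="SOFTSTOP"):
    """Build the OCI CLI call that stops the instance using instance principals."""
    return [
        "oci", "compute", "instance", "action",
        "--instance-id", instance_id,
        "--action", action,
        "--auth", "instance_principal",
    ]


def stop_self():
    """Look up this instance's OCID and ask OCI to shut it down."""
    metadata = fetch_instance_metadata()
    subprocess.run(build_stop_command(metadata["id"]), check=True)
```

Calling `stop_self()` on the instance triggers the shutdown. It only succeeds if the instance belongs to a dynamic group whose policy allows it to manage its own compute instance.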

December 9, 2024 · 2 min · Enrico Pesce
OCI Vision workflow

OCI Vision: Identify Cereals with a Custom Model

OCI Vision: How do we identify cereals?

Is it possible to use OCI Vision to classify images that are not included in the default Vision model, without managing infrastructure or needing deep ML expertise? Yes, it is possible! You can use OCI Vision to identify image content and use this feature to improve your software and business. Let me show you how! In this example, I used cereals, but the same approach can be extended to many visible objects. ...

September 21, 2024 · 4 min · Enrico Pesce
Get a Kubernetes Cluster in Minutes on OCI

My first OCI automation project

After 10 years of experience with DevOps practices, automation, infrastructure as code, and many customer discussions, I decided to build a tool that helps people deploy a well-defined Kubernetes architecture without needing deep infrastructure expertise.

Oracle Kubernetes Engine Deploy project (OKED)

OKED helps you deploy a complete Kubernetes infrastructure on OCI, including the required network dependencies, without requiring OCI expertise. The main requirements that motivated me to develop this project are as follows: ...

September 15, 2024 · 2 min · Enrico Pesce
Intel x86 vs. ARM Architecture: A Comparative Analysis for Server Technologies

In the dynamic field of server technology, the battle between CPU architectures has become a focal point, especially between Intel’s x86 architecture and ARM-based processors. This blog post provides a comprehensive comparison of these two architectures, focusing specifically on their performance metrics, energy efficiency, and their overarching impact on the server and cloud computing sectors.

Detailed Architectural Insights

Intel x86 Architecture

Intel’s x86 architecture, grounded in a complex instruction set computing (CISC) framework, offers a broad instruction range and high computation capabilities. This architecture is renowned for its robust performance and extensive software compatibility, making it the preferred choice in traditional server environments. ...

April 15, 2024 · 3 min · Enrico Pesce
Comparing CPU Multicore Performance of OCI Compute Standard Flex Shapes

When selecting a compute instance, factors such as raw computational power, price-to-performance ratio, and workload optimization play a significant role. The following standard flex shapes are available in most OCI regions:

- VM.Standard.E4.Flex (Processor: AMD EPYC 7J13. Base frequency 2.55 GHz, max boost frequency 3.5 GHz)
- VM.Standard.E5.Flex (Processor: AMD EPYC 9J14. Base frequency 2.4 GHz, max boost frequency 3.7 GHz)
- VM.Standard3.Flex (Processor: Intel Xeon Platinum 8358. Base frequency 2.6 GHz, max turbo frequency 3.4 GHz)
- VM.Optimized3.Flex (Processor: Intel Xeon 6354. Base frequency 3.0 GHz, max turbo frequency 3.6 GHz)
- VM.Standard.A1.Flex (Each OCPU corresponds to a single hardware execution thread. Processor: Ampere Altra Q80-30. Max frequency 3.0 GHz.)

In the article "Performance testing with PHP and OCI Compute instances" I tested single-threaded PHP execution on all OCI standard flex shapes. This time I ran multicore benchmark tests with Geekbench 6 using 2, 4, and 8 CPUs. ...

March 19, 2024 · 3 min · Enrico Pesce