Understand Python’s new lock file format 1 Apr 2025, 7:07 pm
Python Enhancement Proposal (PEP) 751 gives Python a new file format for specifying dependencies. This file, called a lock file, promises to allow developers to reproduce the installation of their Python projects, with the exact same sets of dependencies, from system to system.
Although there are several community solutions, Python has historically lacked an official way to lock dependencies and resolve conflicts between them. The closest thing to a native solution is the requirements.txt file, or the output from pip freeze. The requirements.txt file lists requirements for a project, and possible ranges of versions for each requirement. But it doesn’t address where those requirements should come from, how to validate them against a hash, or many other needs lock files commonly address today.
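To make that gap concrete, here is a minimal sketch (using the third-party packaging library and a hypothetical package name) of what a single requirements-style specifier does and does not carry:

```python
# A minimal sketch of what a requirements.txt-style entry expresses, using the
# third-party "packaging" library. The package name "fixthis" is hypothetical.
from packaging.requirements import Requirement

line = "fixthis>=2.0,<3.0"          # a typical requirements.txt line
req = Requirement(line)

print(req.name)        # "fixthis"     -- the distribution name
print(req.specifier)   # "<3.0,>=2.0"  -- an allowed version range, not a pinned artifact
print(req.url)         # None          -- no record of where the package should come from

# A bare requirement specifier like this carries no hash, index URL, or
# platform-specific artifact list -- the gaps a lock file is meant to fill.
```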
Individual project management tools like Poetry or uv do their best to resolve conflicts between dependencies and record them in lock files. The bad news: Those lock files have no common format, so they’re not interchangeable between tools. They’re typically created in place by each user when installing a project’s requirements, rather than created by the developer and redistributed.
What is a lock file and why it matters
Lock files allow the dependencies of a project to be reproduced reliably, no matter where the project may be set up. Ideally, the lock file lists each dependency, where to find it, a hash to verify it, and any other information someone might need to recreate the dependency set.
Python projects typically don’t have the massive, sprawling dependency sets that can appear in, say, JavaScript projects. But even projects with only a few dependencies can have conflicts.
Imagine you have a dependency named fixthis, which in turn depends on a package named fixerupper. However, fixthis depends specifically on version 1.5 of fixerupper, because later versions have behaviors that fixthis hasn’t addressed yet.
That by itself isn’t a problem, because you can just install fixthis and have version 1.5 of fixerupper installed automatically. The problem comes when other dependencies in your project might also use fixerupper.
Let’s say we add another dependency to our project, fixthat, which also depends on fixerupper. However, fixthat only works with version 2.0 or higher of fixerupper. That means fixthis and fixthat can’t coexist.
On the other hand, let’s say we had an alternative package, fixit, that could work with version 1.5 of fixerupper or higher. Then both fixthis and fixit could work with version 1.5 of fixerupper. Our project’s lock file could record the way this conflict was resolved, allowing other users to install the exact same dependencies and versions needed to avoid such conflicts.
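To see why the first combination is unsatisfiable while the second is not, a rough sketch with the third-party packaging library (same hypothetical package names as above) simply intersects the version constraints:

```python
# Intersecting the constraints on the hypothetical "fixerupper" package shows
# why fixthis (==1.5) and fixthat (>=2.0) cannot coexist, while fixthis and
# fixit (>=1.5) can. Uses the third-party "packaging" library.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

fixthis_needs = SpecifierSet("==1.5")   # fixthis pins fixerupper 1.5
fixthat_needs = SpecifierSet(">=2.0")   # fixthat needs 2.0 or higher
fixit_needs = SpecifierSet(">=1.5")     # fixit is happy with 1.5 or higher

candidates = [Version(v) for v in ("1.5", "2.0", "2.1")]

# No candidate satisfies both fixthis and fixthat -> unresolvable conflict.
print([str(v) for v in candidates if v in (fixthis_needs & fixthat_needs)])  # []

# Version 1.5 satisfies both fixthis and fixit -> the resolution a lock file
# would record so every installation picks the same version.
print([str(v) for v in candidates if v in (fixthis_needs & fixit_needs)])    # ['1.5']
```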
What PEP 751 does for Python
PEP 751, finalized after much back-and-forth since July 2024, describes a common lock file format for Python projects to address many of the concerns described above.
The file, typically named pylock.toml, uses the TOML data format, just like the standard Python project description file pyproject.toml. A pylock.toml file can be written by hand, but the idea is that you should not have to do so. Existing tools will in time generate pylock.toml automatically, in the same way a requirements.txt file can be generated with pip freeze from a project’s virtual environment.
The pylock.toml file allows for a wide range of details. It can specify which version of the lock file standard to use (if it changes in the future), which version of Python to use, what environment markers to respect for different Python versions, and package-level specifiers for much of this information as well. Each specified dependency can have details about its sourcing (including from version control systems), hashes, source distribution info (if any), and more.
A detailed example of the pylock.toml format is available in PEP 751. Each dependency has its own [[packages]] subsection, with details about where and when it was obtained, its hash, version requirements, and so on. The example also shows how a package can have multiple platform-specific sources; numpy, for instance, lists binary wheels for Microsoft Windows and generic Linux systems.
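As a rough illustration of the format’s shape rather than a copy of the PEP’s example, the following Python 3.11+ sketch parses a hypothetical, heavily trimmed pylock.toml-style document with the standard library’s tomllib module; files generated by real tools will carry many more fields:

```python
# Parse a hypothetical, heavily trimmed pylock.toml-style document with the
# standard-library tomllib (Python 3.11+). Field names follow the general
# shape described in PEP 751, but this is an illustration, not the spec.
import tomllib

LOCK_TEXT = """
lock-version = "1.0"
requires-python = ">=3.10"

[[packages]]
name = "fixerupper"
version = "1.5"

[[packages.wheels]]
name = "fixerupper-1.5-py3-none-any.whl"
url = "https://example.org/packages/fixerupper-1.5-py3-none-any.whl"
hashes = {sha256 = "0000000000000000000000000000000000000000000000000000000000000000"}
"""

lock = tomllib.loads(LOCK_TEXT)
print(lock["requires-python"])              # ">=3.10"
for pkg in lock["packages"]:                # each dependency is a [[packages]] entry
    print(pkg["name"], pkg["version"])
    for wheel in pkg.get("wheels", []):     # platform-specific artifacts with hashes
        print("  ", wheel["url"])
```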
When will PEP 751 lock files arrive?
As of this writing, no official or third-party tool supports pylock.toml. Future revisions of tools, from pip outward, are meant to adopt pylock.toml at the pace their providers find suitable, so you can expect the new lock file to be added to workflows and to the makeup of projects over time. You will eventually want to think about whether your project’s set of dependencies is complex enough to demand a pylock.toml file. Anyone creating a package hosted on PyPI, for instance, will want to gravitate toward creating a pylock.toml for the project.
The speed of adoption will be dictated by how soon existing tools make it possible to generate pylock.toml files, and whether that file format will immediately replace their own internal formats. Some third-party tools that generate their own lock files remain hesitant to use pylock.toml as a full replacement. uv, for instance, intends to support pylock.toml as an export and import format, but not as the format for its native lock files, due to some features not yet supported in pylock.toml. However, it’s entirely possible that support for those features could be added to future revisions of the pylock.toml spec.
Download the Strategizing Data Analytics for AI Enterprise Spotlight 1 Apr 2025, 5:00 pm
Download the April 2025 issue of the Enterprise Spotlight from the editors of CIO, Computerworld, CSO, InfoWorld, and Network World.
How to start developing a balanced AI governance strategy 1 Apr 2025, 11:00 am
The best defense is a good offense, right? When it comes to AI governance, organizations need both. The reason is that genAI capabilities evolve quickly, hype accelerates investments, and data risks are amplified through AI applications. This article looks at how to shore up your defense while planning and implementing a strong AI governance offense.
Developing your AI governance strategy
Key questions to address in AI governance include what important regulatory compliance is needed, what data can be used in training AI models, what data must not be shared with public LLMs, and what tools can be used to deploy AI agents. Asking these defensive questions can help protect your organization.
A strong offense guides the business objectives, outcomes, and capabilities to focus on when applying AI. Defining your offensive strategy for AI governance helps channel everyone’s efforts to areas where AI can generate business value and drive digital transformation.
“Our thinking about AI governance is often too limited, focusing only on compliance and risk reduction,” says Kurt Muehmel, head of AI strategy at Dataiku. “Governance is a strength that ensures that AI is aligned with business objectives, is produced efficiently, follows internal best practices, is designed for production from the beginning, and promotes reusing components. AI governance thought of this way becomes not an obligation but a competitive differentiator.”
Executing AI governance with defense and offense strategies requires IT and data science teams to address deficiencies in how organizations balanced innovation and governance in the past:
- Many devops teams developed applications, then bolted on security until devsecops practices helped drive a shift-left security culture.
- Organizations were at first afraid to use public clouds, then aggressively modernized applications for them, only to put finops disciplines in place afterward.
- Ask the chief data officer (CDO) about the challenges of assigning data owners and classifying data while departments adopt citizen data science and become proficient in data analytics platforms.
Beelining to AI capabilities without instituting AI governance is a recipe for AI disaster.
The CDO leads AI governance
Many organizations view AI governance as an extension of data governance and assign the CDO to take charge of defining it. CDOs also have many data-related responsibilities that are foundational for safe and secure AI practices.
“The true opportunity with AI governance does not lie in reducing ethical, regulatory, and business risk—AI governance is needed to help organizations drive the trust and adoption necessary to transform the business using AI,” says Kjell Carlsson, head of data science strategy and evangelism at Domino. “To achieve this, CDOs must provide visibility, auditability, reproducibility, and control—and implement platforms that orchestrate, streamline, and—where appropriate—automate governance activities so that people can focus on reducing risks versus wasted manual effort.”
Henry Umney, managing director of GRC Strategy at Mitratech, says the key priorities on the CDO roadmap for data and AI governance should include:
- Creating a clear definition of AI within the organization and categorizing the new risks AI introduces.
- Building an AI model inventory ranked by business impact and regulatory risk under rules like the EU AI Act.
- Benchmarking existing governance and risk management structures against frameworks like the NIST AI RMF, adding AI-specific controls to existing frameworks across the organization.
The role of the CDO is not just to implement these practices; CDOs must prioritize them effectively and communicate to business stakeholders how governance enables efforts to deliver business value.
“A CDO’s roadmap should balance the adoption of transformative technologies like genAI with the critical need to maintain data sovereignty,” says Jeremy Kelway, VP of engineering for analytics, data, and AI at EDB. “This goes beyond risk reduction as data sovereignty covers governance, observability, and jurisdictional boundaries, laying the groundwork for offensive strategies that drive growth, sharpen competitive capabilities, and enhance customer experiences. By ensuring the appropriate data is secure and shared compliantly with a clear understanding of how the AI models use it, CDOs can confidently leverage their data for real-time insights that spark innovation.”
Bring in business leaders and other stakeholders
Your organization may need a data fabric if many data sources are used in AI models. The chief information security officer should review data security posture management (DSPM) platforms to secure confidential or regulated information stores across multiple clouds, data centers, and edge devices. Then, add observability, data pipeline, data catalog, customer data platform, and master data management capabilities.
It’s a significant enough investment to make a business leader’s head spin. Many of these tools and related practices are important to have in place and upgrade to support AI. However, the CDO and data teams had better be ready to answer why and why now, or risk losing business stakeholders’ interest.
CDOs should craft an AI vision statement, define a data strategy, and manage a roadmap aligning with a plan to drive AI offense capabilities.
“Build your data and AI strategy and a culture of rapid yet responsible AI from day one, as adding it later is much more challenging and costly,” says Ana-Maria Badulescu, senior director of the AI lab, office of the CTO at Precisely. “The CDO roadmap should go beyond governance by providing a comprehensive, integrated solution that covers data quality, data observability, data catalog, data security and privacy, data enrichment with third-party data, and location intelligence. Break down data silos by creating data governance councils and business glossaries to ensure a shared language across the organization.”
What may be obvious to data teams may be out-of-sight, out-of-mind for business stakeholders. Heather Gentile, director of product management of AI risk and compliance at IBM, suggests reinforcing that the results of a model are only as good as the data on which it is built and trained. “The transparency and explainability of governance also successfully accelerates and scales AI initiatives and business impact,” says Gentile.
Embrace new AI data governance priorities
CDOs, data governance, and data scientists must also consider AI-specific capabilities. For example, modelops is the discipline of monitoring ML models for drift and other conditions necessitating retraining. For genAI, data teams should be explicit about what data was used to train an LLM, RAG, AI agents, and other AI capabilities.
“An AI Data Bill of Materials (AI DBoM) is the foundation for responsible AI at scale and should be a part of the CDO’s governance strategy,” says Kapil Raina, data security evangelist at Bedrock Security. “An AI DBoM tracks all data feeding AI models—training, fine-tuning, and inference—ensuring quick project turnarounds with full transparency into what AI systems access and generate. Without it, CDOs are flying blind—exposed to security gaps, non-adherence to the rapidly evolving regulatory landscape, and stunted innovation.”
CIOs, CDOs, and IT teams should also recognize that what worked as a pre-AI data strategy may need enhancements and overhauls to accelerate AI opportunities. Organizations that distribute significant dataops and data quality work to business teams may want to consider centralization to drive efficiencies and consistent data quality metrics.
Rahul Auradkar, EVP and GM of Unified Data Services at Salesforce, says, “Reducing tech debt caused by different data governance controls, manually classifying and tagging data, and the rise of data-driven decision-making has increased governance priorities for CDOs today.”
Another consideration is for construction, manufacturing, and field services organizations with highly distributed operational teams that use disparate management tools and rely on spreadsheets. Their efforts to consolidate workflow tools can drive efficiencies and pave a faster path to AI capabilities.
“Without solid governance in place, your workforce must scramble to find the information they need to do their jobs because it’s trapped in tools and information silos,” says Jon Kennedy, CTO of Quickbase. Such gray work, he says, “undermines productivity and has a ripple effect on the customer experience. Through a consolidation process, IT can address tech sprawl, centralize information on a work management platform, and eliminate gray work while executing their data and AI governance roadmap.”
Elements of an offense strategy in AI governance
The practices I’ve shared so far form the defensive AI governance strategy. While a good defense is the foundation, your AI governance strategy should go one step further.
“A CDO’s data and AI governance strategy should do more than manage risks—it should fuel growth, competitive edge, better customer experiences, and new market opportunities,” says Ed Frederici, CTO of Appfire. “Treating data as a revenue-generating asset and ensuring seamless interoperability can help scale the business.”
Frederici recommends the following ways to enhance your AI governance strategy with a good offense:
- Drive efficiencies through AI-driven automation and internal data marketplaces.
- Build customer trust and increase engagement through personalization engines and ethical AI.
- Improve services by using predictive AI to anticipate needs and reduce churn.
- Accelerate product development with AI-powered market insights.
- Help businesses stay ahead with cross-industry collaboration and strategic data sharing.
Business stakeholders might treat data as assets when leaders extend data and AI governance to a strategy and practices for developing data products. Data products can simplify how departments reuse data and AI capabilities efficiently and create opportunities for developing customer-facing data products and collaborations.
“By treating governance as a built-in capability of data products rather than a separate control layer, organizations can accelerate innovation and time-to-value while actually improving their risk posture through standardized, reusable patterns,” says Srujan Akula, CEO of The Modern Data Company. “Data products with embedded governance controls become powerful building blocks for growth that help launch new customer solutions faster and expand into new markets more easily.”
Another opportunity is to focus data and AI governance opportunities around sales and other revenue-generating workstreams. Jason Smith, senior principal of strategy and transformation at Conga, suggests, “Data leaders must prioritize AI-driven revenue management tools that eliminate silos between departments and streamline the entire revenue process—from proposal to quote generation and more.”
Some may argue that including offense strategies as part of AI governance bleeds into the organization’s overall business, digital transformation, or AI strategy. Perhaps this is a good thing when getting business leaders excited about the offense, as it ensures they also pay attention to the defense.
Agentic AI won’t make public cloud providers rich 1 Apr 2025, 11:00 am
While major cloud providers such as AWS, Google, and Microsoft are rushing to position themselves as leaders in AI development and deployment, the reality of agentic AI’s impact on public cloud growth may not align with their ambitious projections—at least, in my opinion. I’ll tell you why.
Agentic AI represents more of an architectural approach than a technology requiring massive cloud resources. It enables AI systems to work independently toward goals, make decisions, and manage their resources. The distributed nature of agentic AI systems means they can operate effectively across various infrastructure types, often without needing specialized GPU clusters that cloud providers heavily invest in.
The migration patterns suggest a hybrid approach, where workloads move between on-premises environments, private clouds, and various public cloud providers, including AWS, Google Cloud Platform, and others. This flexibility in deployment options challenges the assumption that agentic AI will drive massive public cloud adoption from the big three hyperscalers.
Integration versus centralization
Agentic AI isn’t what most people think it is. When I look at these systems, I see something fundamentally different from the brute-force AI approaches we’re accustomed to. Consider agentic AI more like a competent employee than a powerful calculator.
What’s fascinating is how these systems don’t need centralized processing power. Instead, they operate more like distributed networks, often running on standard hardware and coordinating across different environments. They’re clever about using resources, pulling in specialized small language models when needed, and integrating with external services on demand. The real breakthrough isn’t about raw power—it’s about creating more intelligent, autonomous systems that can efficiently accomplish tasks.
The big cloud providers emphasize their AI and machine learning capabilities alongside data management and hybrid cloud solutions, whereas agentic AI systems are likely to take a more distributed approach. These systems will integrate with large language models primarily as external services rather than core components. This architectural pattern favors smaller, purpose-built language models and distributed processing over centralized cloud resources. Ask me how I know. I’ve built dozens for my clients recently.
The diverse landscape of modern IT infrastructure offers ideal platforms for deploying agentic AI systems. Regional providers, sovereign clouds, managed services, colocation facilities, and private clouds can provide more cost-effective and flexible alternatives to major public clouds.
This distributed approach aligns perfectly with agentic AI’s need for edge computing, local processing, and hybrid architectures. Organizations can now build scalable AI solutions that leverage the right mix of infrastructure while maintaining control over costs, performance, and data sovereignty.
The efficiency of these distributed approaches is evident in how they handle data changes and processing. Modern systems can achieve near-continuous, block-level operations while integrating directly with storage subsystems, avoiding unnecessary I/O operations. This efficiency often makes smaller, specialized providers more attractive than hyperscalers.
Future growth trajectory
Although developments such as SQL Server 2025’s integration of generative AI capabilities across on-premises and cloud environments show promise, the growth pattern for hyperscalers may not match their expectations. The distributed nature of agentic AI, combined with the need for cost-effective, specialized solutions, suggests that growth will be spread across a broader ecosystem of providers rather than concentrated among the major public cloud platforms.
The future may resemble a bridge architecture where various components act as intermediaries between different environments. However, they must orchestrate resources and capabilities across multiple providers and platforms. This approach prioritizes flexibility and efficiency over vendor consolidation, potentially limiting the dominance of any single major hyperscaler in the agentic AI space.
AWS, Google Cloud, and Microsoft Azure will certainly play important roles in the agentic AI landscape, but their position may be more as components of broader, more distributed architectures rather than as central, dominant platforms. Organizations implementing agentic AI solutions will likely adopt multiprovider strategies that optimize for specific requirements, costs, and performance needs rather than consolidating with a single hyperscaler.
As enterprises reevaluate their AI strategies, many are reconsidering their reliance on public cloud providers. The rapidly rising costs of running AI workloads on hyperscaler infrastructure have caught businesses off guard, especially when combined with the sticker shock of generative AI systems. For organizations that moved to the cloud a decade ago, expectations of cost savings have been upended, leading many to explore alternatives.
At the same time, the cost of on-premises infrastructure has fallen significantly. With the greater affordability of owned or leased hardware and the availability of modern colocation providers and managed services, enterprises no longer need to manage the daily operations of a data center. This shift gives businesses cost control and flexibility without sacrificing scalability or performance.
The hyperscalers must now rethink their position in the AI ecosystem. As the market for AI infrastructure normalizes, enterprises are looking for the most efficient blend of cloud, colocation, MSP, purpose-built clouds, and on-premises solutions. Organizations prioritize sustainability, sovereignty, and resource efficiency over legacy assumptions about public cloud dominance. For hyperscalers, that means embracing this shift and adapting their offerings to remain relevant during this transition—though some initial pain is inevitable as the industry adjusts.
Java plan prepares to restrict final field mutation 1 Apr 2025, 11:00 am
A JDK Enhancement Proposal (JEP) to prepare to make final mean final in Java would issue warnings about uses of deep reflection to mutate final fields. The warnings would prepare Java developers for a future release that ensures integrity by default by restricting final field mutation, making Java safer and potentially faster.
The proposal was created in early February and updated on March 31. A key goal of the plan is to prepare the Java ecosystem for a future release that by default disallows mutation of final fields by deep reflection. As of that release, developers will have to explicitly enable the capability to mutate final fields at startup. Other goals include aligning final fields in normal classes with components of record classes, which cannot be mutated by deep reflection, and allowing serialization libraries to continue working with Serializable classes, even those with final fields. There is no plan to deprecate or remove any part of the Java Platform API or prevent mutation of final fields by serialization libraries during deserialization. The JEP currently does not list a version of Java that would get the final-means-final capability.
Detailing motivation for the plan, the proposal says Java developers rely on final fields to represent immutable state. The expectation that a final field cannot be reassigned, whether deliberately or accidentally, is often crucial when developers reason about correctness. But the expectation that a final field cannot be reassigned is false. The Java platform provides APIs that allow reassignment of final fields at any time by any code in the program, thus undermining reasoning about correctness and invalidating important optimizations. Thus a final field is as mutable as a non-final field.
Although relatively little code mutates final fields, the existence of APIs for doing this makes it impossible for developers or the JVM to trust the value of any final field. This compromises safety and performance of all programs, according to the proposal. Plans call for enforcing the immutability of final fields so that code cannot use deep reflection to reassign them at will. One special case—serialization libraries needing to mutate final fields during deserialization—will be supported via a limited-purpose API.
New Python lock file format will specify dependencies 1 Apr 2025, 1:10 am
Python’s builders have accepted a proposal to create a universal lock file format for Python projects that would specify dependencies, enabling installation reproducibility in a Python environment.
Python Enhancement Proposal (PEP) 751, accepted March 31, aims to create a new file format for specifying dependencies that is machine-generated and human-readable. Installers consuming the file should be able to calculate what to install without needing dependency resolution at install-time, according to the proposal.
Currently no standard exists to create an immutable record, such as a lock file, that specifies what direct and indirect dependencies should be installed into a Python virtual environment, the proposal states. There have been at least five well-known solutions to the problem in the community, including PDM, pip freeze, pip-tools, Poetry, and uv, but these tools vary in what locking scenarios are supported. “By not having compatibility and interoperability it fractures tooling around lock files where both users and tools have to choose what lock file format to use upfront, making it costly to use/switch to other formats,” the proposal says.
Human readability of the file format enables contents of the file to be audited, to make sure no undesired dependencies are included in the lock file. The file format also is designed to not require a resolver at install time. This simplifies reasoning about what would be installed when consuming a lock file. It should also lead to faster installs, which are much more frequent than creating a lock file.
The format has not yet been associated with a specific release of Python, but is guidance for tooling going forward. Actual adoption remains open-ended. Acceptance of the format is full and final, not provisional. The universal format has been the subject of an estimated four years of discussion and design.
InfoWorld Senior Writer Serdar Yegulalp contributed to this report.
Apple’s Swift language gets version manager 31 Mar 2025, 9:33 pm
Apple has introduced swiftly 1.0, a version manager for the Swift programming language that is intended to ease the process of installing, managing, and updating the user’s Swift tool chain.
Swiftly 1.0 was announced March 28. While swiftly has been available for some years as a community-supported tool for Swift developers using Linux, the swiftly 1.0 release makes it an officially supported part of the core Swift tool chain. The project is now hosted in the Swift GitHub organization. Apple has added macOS support to swiftly, making it easier to install Swift separately from the Xcode development environment.
The swiftly tool can install the standalone Swift toolchain, providing commands to install Swift on a new system, update to the latest stable version, and experiment with nightly snapshots or older versions. Written in Swift, swiftly also makes it easy to switch between multiple installed tool chains. By adding a file to a project repository, developers can configure swiftly to use the same Swift tool chain version for all members of the development team.
Documentation on swiftly is at swift.org. To retrieve the latest Swift release, swiftly uses the Swift OpenAPI plugin to generate code to interact with the swift.org website. Plans call for having swiftly become the default way to install Swift outside of Xcode. The initial version supports macOS and Linux distributions such as Ubuntu, Debian, Red Hat Enterprise Linux, and Fedora.
How AI is transforming IDEs into intelligent development assistants 31 Mar 2025, 11:00 am
Ever feel like you’re spending more time squashing bugs than actually building something? You’re not alone—developers spend a whopping 35% of their time debugging and reviewing code instead of writing it. That’s like ordering pizza and only eating the crust. Enter AI-powered IDEs, the new coding sidekicks that automate drudgery and help you focus on writing code.
For years, the trusty IDE has been the MVP of every developer’s toolkit. Now the familiar, feature-packed integrated development environment is adding a next-level assistant that can speed things up, handle the boring stuff, catch errors before they bite, and free up developers’ brains for creative problem-solving.
In this article, we’ll explore how AI is leveling up IDEs, transforming them with smarter debugging and automated refactoring and even lending a hand with decision-making. Whether you’re a seasoned pro or leading the charge on your team, these insights will show you how an AI-driven IDE can help you stay ahead in a world where innovation is everything.
Traditional IDE features
Long before the advent of AI-driven tools, IDEs played a pivotal role in transforming developers’ work. By consolidating essential tools into a single platform, early IDEs helped developers move away from tedious, manual workflows and focus on actual problem-solving. These foundational features laid the groundwork for today’s modern, AI-powered capabilities.
Syntax highlighting and code formatting
One of the earliest productivity boosters was syntax highlighting, which made reading and writing code significantly more manageable. By visually differentiating keywords, variables, functions, and other code elements with distinct colors, developers could quickly understand code structure, spot typos, and reduce errors. Combined with automatic code formatting, which ensured consistent indentation and styling, these features helped maintain clean, readable code bases—especially crucial in large projects with multiple contributors.
Code compilation and execution
Early IDEs streamlined the process of writing, compiling, and executing code by bundling these steps into a single workflow. Instead of manually running separate compilers and debuggers from the command line, developers could write their code, hit a button, and instantly see the results. This rapid feedback loop allowed for quicker iterations and more experimentation, reducing the time it took to test new ideas or fix bugs.
Integrated debuggers
Debugging used to be a labor-intensive process, often involving manually combing through logs or adding print statements. Early IDEs revolutionized this by integrating visual debugging tools. Features like breakpoints, step-through execution, and variable inspection gave developers more insight into the runtime behavior of their code. This enabled them to diagnose and resolve issues more efficiently, paving the way for faster, more reliable software development.
Search and navigation tools
As projects grew larger, navigating through hundreds or even thousands of lines of code became increasingly challenging. Early IDEs addressed this with powerful search tools that allowed developers to quickly locate variables, methods, or files within a project. Features like “go to definition” and “find all references” helped developers understand how different parts of their code base interacted, saving hours of manual searching.
Code templates and snippets
Early IDEs introduced templates and snippets to reduce repetitive coding tasks. These were predefined chunks of code that could be quickly inserted into a project. Whether it was a boilerplate class definition, a common function, or a frequently used design pattern, these templates made it easy to adhere to coding standards while reducing the effort required to write repetitive structures.
Version control integration
With the rise of collaborative development, version control became essential for managing code changes. Early IDEs began integrating tools like Git and SVN, allowing developers to commit, branch, and merge code directly within the IDE. This not only improved collaboration but also minimized the friction of switching between different tools for version control and development.
Plugin ecosystems
While core features addressed general productivity needs, early IDEs embraced extensibility through plugin ecosystems. Developers could customize their environments by adding tools tailored to their specific languages, frameworks, or workflows. This flexibility made IDEs more adaptable and allowed them to stay relevant as development practices evolved.
These early innovations fundamentally changed how developers approached their work, transforming what used to be time-consuming tasks into streamlined processes. While modern AI-powered features take productivity to unprecedented levels, it’s essential to recognize the foundational tools that first enabled developers to work faster, write cleaner code, and collaborate more effectively. These features not only improved individual productivity but also set the stage for the sophisticated capabilities of today’s intelligent IDEs.
AI-powered features of intelligent IDEs
While the software world is speeding up like a race car, most developers are stuck in traffic, dealing with repetitive tasks like debugging, scrolling through endless code, or fixing tiny mistakes. AI redefines what it means to have a “smart” development environment, turning traditional IDEs into full-fledged intelligent development assistants. These modern tools aren’t just about editing and compiling code—they’re about streamlining workflows, automating repetitive tasks, and empowering developers to work smarter, not harder.
Here are some standout features that showcase the transformative power of intelligent IDEs.
Code explanation
For developers, understanding unfamiliar or legacy code can take time and effort. Intelligent IDEs with AI-driven code explanation capabilities make this process much more manageable. These tools can analyze code blocks and provide plain-language summaries, describing what the code does without requiring developers to decipher every line. This feature is especially valuable when working with large code bases, where clarity can save hours of effort.
Imagine being handed a project full of legacy code with minimal documentation. Instead of combing through every file, you could use your IDE to highlight a section and instantly get an explanation, including the logic and intent behind the code. This is not only a time-saver but also a game-changer for team onboarding and collaboration, helping new members get up to speed without spending weeks piecing together the code’s purpose.
Moreover, this feature extends its usefulness in debugging and refactoring tasks. When code explanations are combined with AI insights, developers can quickly spot areas of inefficiency or logical errors. By bridging the gap between raw code and human understanding, intelligent IDEs make even the most complex projects more approachable.
Intelligent code completion
Gone are the days when code completion merely suggested the next word. AI-powered IDEs understand the broader context of a project, analyzing tasks, coding styles, and application architecture to suggest improved code, complete functions, or structural changes. This contextual awareness enables developers to write code more quickly and accurately.
Furthermore, these intelligent code completion tools promote consistency within development teams. Providing code snippets or patterns that adhere to project standards helps maintain a uniform coding style despite varying experience levels among team members. This leads to accelerated development, simplified maintenance, and fewer errors over time.
Proactive debugging assistance
Debugging can often feel like a high-stakes detective game, where each error message is a clue waiting to be pieced together. Imagine diving into the complexities of your code, equipped with the sharp instincts of a seasoned detective, tracking down elusive bugs that threaten to derail your project. This is where the magic of AI-powered IDEs comes into play, transforming debugging from a tedious chore into an exhilarating experience. These intelligent tools meticulously analyze error messages, scouring your code for potential pitfalls and illuminating the dark corners where issues may hide. They don’t just point out problems, though. They also offer insightful suggestions for fixes, allowing you to tackle challenges head-on before clicking “Run.”
Automated documentation and testing
Documentation and testing, often seen as burdensome tasks in software development, are significantly eased by intelligent IDEs. These tools automate large parts of the process, sparing you from hours of tedious writing. With automated documentation, they can generate comments, inline explanations, or even complete API documentation based on your code, ensuring your project remains well-documented without the usual drudgery.
Consider an AI-powered IDE that can analyze your chosen method and automatically create a comment block summarizing its functionality, inputs, and outputs. This feature is particularly beneficial in collaborative settings where clear documentation is crucial for effective teamwork. By generating this baseline automatically, developers can concentrate on enhancing and expanding the documentation to include edge cases or nuanced details, rather than starting from scratch. This not only saves time but also ensures that everyone on the team is on the same page.
Intelligent IDEs also simplify the creation of unit tests by automatically analyzing your code and generating test cases. For example, the IDE might create a suite of tests for a function, covering various scenarios like edge cases, typical usage, and invalid inputs. These automated tests save time and significantly improve code quality by encouraging more thorough testing coverage. Developers can then refine and expand these tests to address more complex cases, creating a robust safety net for the application.
Streamlined refactoring
Refactoring is essential for maintaining clean and efficient code, yet it can often be time-consuming and prone to errors. Intelligent IDEs enhance the refactoring process by analyzing the entire code base and offering suggestions to improve structure and readability. They can identify redundant code, propose optimizations, and recommend alternative implementations for complex logic while ensuring that existing functionality remains intact. For example, suppose a developer faces repetitive code patterns across multiple files. In that case, an AI-powered IDE might suggest consolidating them into a reusable function. Or if a method is excessively long, an intelligent IDE might recommend dividing it into smaller, manageable parts.
The true strength of AI-driven refactoring lies in its ability to scale insights throughout the entire project. Whether renaming variables, reorganizing class hierarchies, or optimizing database queries, intelligent IDEs deliver actionable suggestions that save developers significant time. Automating many tedious aspects of refactoring enables developers to focus on strategic improvements, resulting in cleaner and more performant code while reducing stress and increasing confidence in their work.
Seamless workflows
One of the most impressive aspects of AI-powered IDEs is how seamlessly these features integrate into your workflow. There’s no need to juggle multiple plugins or external tools—everything from debugging insights to documentation generation is built-in and ready to use.
Smarter tools, smarter developers
IDEs like Apple’s Xcode, JetBrains’ Rider, and Microsoft’s Visual Studio are just the beginning of the development environments becoming more intelligent with the help of AI. In the future, we can expect to see IDEs that predict bottlenecks before they happen or recommend best practices tailored to your team’s workflow. These aren’t far-off dreams—they’re the next steps for intelligent development environments.
Whether you’re tackling a complex code base, dealing with legacy systems, or building something entirely new, intelligent IDEs are reshaping how developers approach their work. These tools free up time for the creative and problem-solving aspects of development by automating routine tasks and reducing friction, enabling teams to focus on delivering innovative, high-quality software.
The future of coding is here: more intelligent, more innovative, and way more exciting.
Chris Woodruff — or as his friends call him, Woody — is a software developer and architect now working as a solution architect at Real Time Technologies. You can find more about him at https://woodruff.dev.
—
New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.
14 alternative managed Kubernetes platforms 31 Mar 2025, 11:00 am
Kubernetes is mighty powerful but highly complex. This has led many organizations to ditch self-hosted solutions and move toward more fully managed Kubernetes platforms. Nearly 90% of Kubernetes users rely on cloud-managed services, Datadog reported in 2021.
The top cloud hyperscalers each have their own managed Kubernetes platforms: Amazon Elastic Kubernetes Service (EKS), Azure Kubernetes Service (AKS), and Google Kubernetes Engine (GKE). But that’s just the beginning. Managed Kubernetes services are available far and wide, from long-established technology companies and up-and-coming startups alike. The Cloud Native Computing Foundation (CNCF) tracks over 100 certified vendors.
While much has been said about the major public cloud offerings, interest in alternative managed Kubernetes platforms is growing. Many are specialized for unique environments, like edge, on-premises, or smaller container deployments. Some offerings unlock managed infrastructure at a fraction of the cost of the bigger cloud players.
While large incumbents stagnate and hike prices, newcomers are pushing innovation—especially with agnostic layers that manage clusters across multiple public clouds or private clouds. Below, we’ll explore the range of managed Kubernetes platforms and compare some of the top offerings for small-to-medium businesses as well as enterprises.
Alibaba Container Service for Kubernetes (ACK)
Alibaba Cloud Container Service for Kubernetes (ACK) provides many out-of-the-box functions to streamline cluster management and containerized application deployment, offering high scalability on Alibaba Cloud infrastructure.
Key benefits include a native VPC for secure networking and instant access to GPU-accelerated instances. Developers can deploy across multiple geographic zones and test releases using built-in canary and blue-green deployment strategies. ACK is Kubernetes-conformant and includes professional support services.
ACK prioritizes VPC networking, lacking native support for modern eBPF-based CNIs like Cilium (although it offers plugins). It also relies on Alibaba Cloud services for logging, storage, and RBAC (role-based access controls)—great for Alibaba Cloud users, not so great for those seeking open-source flexibility.
If you’re operating in the Asia-Pacific region or you already have ties to Alibaba Cloud, ACK is an obvious consideration.
D2iQ Kubernetes Platform (DKP)
D2iQ Kubernetes Platform (DKP), now owned by Nutanix, evolved from Mesosphere, which originally bundled the now-defunct Apache Mesos, an orchestrator that didn’t fare too well in the container wars. After pivoting to Kubernetes, DKP introduced automation for managing clusters across clouds.
DKP’s control plane centralizes visibility across clusters, aiding troubleshooting and root cause analysis. It’s CNCF-conformant, with declarative APIs and standard kubectl commands. DKP also meets NSA/CISA guidelines and supports air-gapped deployments.
Most users find DKP to be a robust and reliable platform for production-grade cluster management, simplifying database provisioning, CI/CD, backups, vulnerability scanning, and monitoring. However, some cite poor documentation, unresponsive support, and a steep learning curve for initial setup. Others report salespeople lacking product knowledge.
DKP is a good choice for cross-cloud organizations with a strong GitOps culture looking for a simpler way to run multiple clusters across various deployment environments.
DigitalOcean Kubernetes (DOKS)
DigitalOcean Kubernetes (DOKS) is a managed Kubernetes platform with a fully managed control plane accessible via UI, API, or CLI. It abstracts infrastructure management, offering automated high availability, autoscaling, and backups with SnapShooter.
Users appreciate its intuitive web interface, streamlined CLI, and easy onboarding. Updates take just a few commands, and native load balancers and volumes integrate seamlessly. Per-node costs tend to be lower than hyperscalers. GPU workloads are supported via manually deployed Nvidia-enabled droplets.
Early security issues, like publicly exposed etcd, have been resolved. However, DOKS lacks built-in Network File System (NFS) support for distributed storage—while workarounds exist, some platforms offer native options. Virtual private cloud (VPC) networking is available but less flexible than AWS or GCP.
DOKS continues to improve and is a solid choice for smaller-scale Kubernetes workloads like APIs, worker nodes, or log processing.
IBM Cloud Kubernetes Service (IKS)
IBM Cloud Kubernetes Service (IKS) is a managed Kubernetes platform for deploying containerized applications on IBM Cloud. It offers advanced scheduling, healing, monitoring, and user-friendly deployment tools.
Users find workload scalability and high availability as key advantages of IKS. Its resource isolation can also help support highly regulated environments. CNCF-certified, IKS provides predictable Kubernetes API behavior, a native container registry, and integration with other IBM services, including Watson.
Unlike multi-cloud-focused platforms, IKS is IBM Cloud-specific, limiting its role in agnostic container orchestration. Stellar opinions of IBM Cloud are rare, with developers citing high costs, troubleshooting difficulties, and gaps in documentation. With IKS specifically, experiences vary.
If you’re already using IBM Cloud and need tight integration with IBM services, IKS may be a good option. Despite mixed feelings about IBM Cloud, reviews say that IKS gets the job done.
Kubermatic Kubernetes Platform
Kubermatic Kubernetes Platform (KKP) is a managed Kubernetes distribution optimized for edge constraints like low bandwidth or low processing power. It is highly portable, supporting hybrid multi-cloud, data centers, and multi-tenant environments.
KKP includes built-in automation for scaling, healing, provisioning, updates, and backups. It’s CNCF-certified, so it adheres to Kubernetes-native commands, and a self-managed open-source community version is available under the Apache License 2.0.
KKP’s Container Network Interface (CNI) support was originally limited to Canal but now supports Cilium and others. While KKP has a smaller user base than major managed Kubernetes services, it is a significant upstream Kubernetes contributor.
KKP clusters are pretty vanilla for a managed platform. If you want a Kubernetes-native managed platform with high customization for your containers on the edge, KKP is a good option.
Linode Kubernetes Engine (LKE)
Akamai’s Linode Kubernetes Engine (LKE) is a managed platform for deploying containerized applications without maintaining a Kubernetes cluster. It features a fully managed control plane and programmatic ways to provision clusters.
Users praise LKE’s ease of use, high-quality customer support, and transparent pricing—you pay only for worker nodes, and inbound transfers are free. LKE guarantees 99.99% uptime, supports multiple Kubernetes versions, and enables quick add-on tools.
However, LKE lacks some advanced features found in larger platforms. Notably, it doesn’t offer a private container registry, and while the dashboard is managed, users must handle deployment configurations. (Akamai App Platform is a more ready-to-run platform, but it’s still in beta). GPU workloads require additional product subscriptions.
As Swapnil Bhartiya writes for TFiR.io, LKE is “designed for typical cloud users,” serving both small teams and enterprises. However, its simplicity leads many to categorize it with smaller providers. Elliot Graebert describes LKE as “best for startups.”
Mirantis Kubernetes Engine (MKE)
Mirantis Kubernetes Engine (MKE), formerly Docker Enterprise, is a Kubernetes-based container orchestration platform that runs on bare metal, public cloud, and everything in between. Built on open source, it includes Calico for networking and KubeVirt for managing both containers and VMs.
CNCF-certified, MKE offers flexible access via web UI, CLI, or API. Users appreciate its ease of use, strong security controls, unlimited worker nodes, and an internal trusted registry for container images.
However, some folks are wary of Mirantis’s open-source commitment due to its decision to make Lens, a popular Kubernetes dashboard, closed-source, and its track record of productizing free Docker-related tools. Reviews also cite customer support issues and opaque documentation.
For those embedded in the Docker ecosystem who value simplicity over full flexibility, MKE remains a solid choice.
Oracle Kubernetes Engine (OKE)
Oracle Kubernetes Engine (OKE) is a Kubernetes management platform built on top of Oracle Cloud Infrastructure (OCI). In addition to automating maintenance, patching, and repairs, OKE supports autoscaling and efficient resource utilization.
OKE is extensible, providing cluster add-ons for areas like service mesh, cluster autoscaling, metrics, ingress controller, load balancing, and more, making it good for organizations intermeshed with the Oracle ecosystem. While Oracle generally draws enterprise users, even some indie developers are turning to OCI’s Always Free tier for self-hosted projects.
Some Oracle Cloud users report sudden closures for accounts on the free tier and not the pay-as-you-go plan. Others report a clunky experience using the OCI user interface. As Arnold Gálovics, co-founder and CEO at Docktape Technologies, writes, “The Oracle Cloud console interface is a big step back.”
But if you’re looking for an enterprise-focused Kubernetes management layer with a generous free tier, OKE fits the bill.
Platform9 Managed Kubernetes
Platform9 Managed Kubernetes (PMK) is a fully managed Kubernetes service that automates cluster operations like deployment, monitoring, healing, and upgrades. Cloud-agnostic, PMK runs across on-premises, edge, and public clouds.
Users report successful scaling with PMK. It includes multi-cluster management, multi-tenancy, and granular RBAC policies, with Platform9’s Always On Assurance guaranteeing high availability. Closer to upstream Kubernetes than competitors, PMK offers a solid developer experience.
However, PMK lacks built-in private registries, and its CNI support is limited (albeit customizable via plugins). Its cloud-hosted control plane may not suit strict on-prem compliance needs. Although users report cost savings, exact pricing details are opaque.
For enterprises seeking private-cloud Kubernetes, PMK is a strong alternative to Red Hat OpenShift or VMware Tanzu.
Rackspace Managed Platform for Kubernetes (MPK)
Rackspace Managed Platform for Kubernetes (MPK) is powered by Platform9, providing a unified control plane for cluster deployment, monitoring, incident response, and upgrades.
MPK supports three environments—Rackspace bare metal, AKS, and EKS. Unique benefits include an SLA guaranteeing Kubernetes upgrades within 120 days of end of life and dedicated support, with a pod of Kubernetes-certified engineers assisting each customer.
Highly Kubernetes-conformant, MPK integrates CNCF-backed tools like Prometheus, Fluentd, Helm, and Istio. However, MPK lacks a native container registry, IAM, and storage, requiring public cloud or bring-your-own solutions.
MPK is a solid choice for teams using Rackspace for bare-metal hosting who want hands-on support and a centralized platform to manage AWS’s and Azure’s Kubernetes services.
Rancher
Rancher, by SUSE, is a Kubernetes-as-a-service solution for on-premises and cloud. Rancher can manage clusters for several Kubernetes platforms including Rancher Kubernetes Engine (RKE), K3s, AKS, EKS, and GKE.
Developers tend to find Rancher’s unified web UI easy to get started with. Rancher also provides an API and CLI and supports Terraform with its own slice of GitOps. It ships with secure administrative controls, including OAuth and other login options. Rancher has a large user base and Slack groups, making it easy to find community and support.
SUSE has introduced price hikes in recent years. Engineers have also reported performance and scalability challenges with Longhorn, SUSE’s native storage solution, and often recommend alternatives for backup storage. Rancher also supports K3k, Kubernetes in Kubernetes, which allows you to run isolated K3s clusters within a larger Kubernetes environment.
All in all, Rancher is comparable to OpenShift, but less opinionated and more modular, with a different approach to multi-tenancy. If you need multi-cloud, multi-cluster management with fewer vendor restrictions, Rancher is a solid choice. Portainer is a comparable alternative.
Red Hat OpenShift Kubernetes Engine
Red Hat OpenShift is a hybrid cloud platform that streamlines Kubernetes with a developer tool chain, simplifying cluster management. It includes built-in observability, networking, security, and GitOps, making upgrades and patches easier than stand-alone Kubernetes. Unlike cloud-specific services, OpenShift is portable, running on-prem, in data centers, or across clouds.
OpenShift Kubernetes Engine is a more pared-down version of OpenShift, offering a managed Kubernetes environment without the higher-level platform-as-a-service (PaaS) layer. It also supports Kubernetes Operators and running virtual machines alongside containers.
A potential downside is that OpenShift is much more opinionated compared to other services like AKS. It favors its own oc CLI over kubectl, and some Helm charts and Operators may need adjustments due to its stricter security model.
OpenShift is suitable for on-prem deployments, hybrid teams managing VMs and containers, and Red Hat customers. If a portable, enterprise-ready Kubernetes distribution with built-in security and automation is what you seek, OpenShift is a strong contender.
Scaleway Kubernetes Kapsule
Scaleway, an EU-based cloud provider, offers Kubernetes Kapsule, a managed Kubernetes service focused on autoscaling and resilience. Scaleway also provides Kosmos for multi-cloud Kubernetes deployments.
Kapsule features a sleek UX, strong customer support, and flexible cluster management via API, CLI, and Terraform. You pay only for the nodes you use, making it cost-effective for personal clusters or experimentation. Scaleway’s application library includes pre-configured images for common add-ons. Kapsule is also CNCF-certified, ensuring compliance with standard Kubernetes APIs.
A major drawback with Scaleway is the few regions it supports—only France, the Netherlands, and Poland—hindering a truly global reach. Some find the lack of certain features, like advanced load balancing and DNS, to be a deal breaker. Users also report slow provisioning times, outages, and reliability issues.
Due to its limited feature set and geographic distribution, Kapsule seems best-suited to side projects and EU-based startups needing an affordable option that aligns with European data protection regulations.
VMware Tanzu Kubernetes Grid (TKG)
VMware’s Tanzu Kubernetes Grid (TKG) is a Kubernetes platform that streamlines networking, authentication, monitoring, logging, and ingress control. Built partially on open source, it leverages Cluster API to manage multiple clusters. TKG performs well and offers both CLI and UI options.
However, TKG is no longer multi-cloud—since v2.5, TKG dropped support for AWS and Azure workloads. Now focused almost entirely on VMware vSphere, it’s unsuitable as an agnostic Kubernetes control plane. Managing Kubernetes across clouds requires Tanzu Mission Control (TMC) alongside native services like EKS, AKS, or GKE.
Another challenge is Tanzu’s convoluted branding and documentation—even VMware employees struggle to explain its SKUs. Meanwhile, Broadcom’s takeover of VMware resulted in steep price hikes. VMware has also deprecated a handful of Tanzu packages, raising questions about VMware Tanzu’s long-term viability.
If you’re deeply invested in vSphere and virtual machines, you can absorb higher costs, and you don’t need a true multi-cloud solution, TKG may be a fit. Otherwise, more flexible and more future-proofed alternatives exist.
Honorable mentions
There are countless other managed Kubernetes platforms, and more continue to emerge alongside niche clouds like Hetzner or Spectro Cloud. Other comparable, full-featured managed Kubernetes platforms include OVHCloud Managed Kubernetes and Civo Kubernetes.
Tencent Kubernetes Engine (TKE) and Huawei Cloud Container Engine (CCE) are other options for those in the Asia-Pacific region.
The big players also offer their own stripped-down flavors of Kubernetes management. For instance, AKS Automatic and EKS Auto Mode provide frictionless developer experiences to automate cluster deployments and operations. And Google Cloud Anthos is emerging as a hybrid multi-cloud solution.
Plenty of other solutions specialize in more niche Kubernetes management functions. For instance, other managed services focus on universal control for multi-cluster, multi-cloud management, such as Portainer, Rafay, Omni, Liqo, and Kube Clusters.
For the edge or small container deployments, slim options include MicroK8s, K3s, and K0s, while Vultr Kubernetes Engine provides a more managed experience. Kubespray, a developer favorite, provides an open source, Ansible-based toolset for deploying Kubernetes clusters yourself.
The right tool for the job
Plug-and-play Kubernetes services take a lot of the hassle out of cluster management. But it all depends on scale—managed services may not be necessary if you’re not running many clusters simultaneously. Smaller deployments often opt for simpler container runtimes, like Docker Compose or Nomad. Others turn to platform-as-a-service alternatives, like Heroku, Fly.io, or Cloud Run.
Depending on your needs, you may only require specific tools. For instance, Karpenter is a popular open-source tool strictly for autoscaling cluster nodes. Or, perhaps you only need a dashboard like Devtron or a UI like Aptakube or Octant.
Alternatively, building may be better than buying if you anticipate needing more granular control at the infrastructure level and you have the wherewithal. With the right technical chops, you might consider sticking with built-in kubeadm and hosting Kubernetes yourself for ultimate control.
Evaluating managed Kubernetes services boils down to two main factors: how much you want to manage and what third-party services you need, says Michael Levan, principal consultant and advisor. While automated services can remove infrastructure management, they may not always integrate well with certain third-party tools. “It really comes down to, like anything in the cloud, how much control you want to give away,” he says.
How Terraform is evolving infrastructure as code 31 Mar 2025, 11:00 am
Last week I talked up OpenTofu—and for good reason. The Terraform fork has moved beyond being the darling of the infrastructure-as-code (IaC) community to becoming a real, if early, enterprise contender. However, in so doing, I inadvertently threw shade at the market leader, Terraform from HashiCorp. It’s worth remembering that while individual products or projects can move fast, industries tend to move slowly. We’re just a decade or so into the latest evolution of the IaC market, with tens of billions of dollars in play and an IT management market that is much more expansive than the IaC category. It’s this larger market that HashiCorp has been tackling, with a suite of services that complement Terraform and that will pair with IBM Red Hat products like Ansible.
Against this backdrop, open source challengers such as OpenTofu and Pulumi, or cloud heavyweights like AWS CloudFormation, are credible causes for concern, but HashiCorp is playing the long game and transforming itself into a strategic vendor that does more than IaC. Let’s take a look at its odds.
Telling a holistic story
Infrastructure as code is a way of declaring system resources as code, making it easier to ensure repeatability, auditability, and portability across software infrastructure. Terraform has become the bedrock for managing IaC, an essential tool for tens of thousands of organizations navigating the complexities of modern cloud deployments. Terraform’s influence is strong and growing within IaC and configuration management, but its presence is less pronounced in the (much) more expansive IT management software market.
That’s a good thing. It means there’s lots of room to grow. It also means there’s all sorts of competition.
Within the IaC space, Pulumi has emerged as a notable competitor by offering a developer-centric approach that allows infrastructure to be defined using familiar programming languages such as Python, Go, and JavaScript, as I’ve written. This resonates with development teams that might perceive Terraform’s HashiCorp Configuration Language (HCL) as an additional learning curve. For organizations deeply invested in AWS, AWS CloudFormation provides a tightly integrated and AWS-native IaC solution. And, as I wrote last week, OpenTofu has stormed onto the scene with an open source, community-driven, and increasingly innovative spin on IaC. None of these are simple challenges for Terraform.
But with the exception of AWS, these vendors aren’t matching HashiCorp’s more expansive approach to IaC. HashiCorp has responded to competitive pressures by emphasizing comprehensive infrastructure life-cycle management (ILM), positioning Terraform as a solution that spans the entire life cycle of infrastructure, from initial setup (day 0) to deployment (day 1) to ongoing management and security (day 2). This holistic perspective caters to the evolving needs of organizations as their cloud infrastructure matures.
HashiCorp also offers features like Terraform Stacks for managing complex multicomponent deployments, module life-cycle management to streamline upgrades and deprecations, and enhanced tagging for improved organization and governance within Terraform. Couple this with tight integration into other HashiCorp products such as Vault (secrets management), Consul (service discovery and connectivity), and Nomad (workload orchestration), and enterprises walk away with a more complete solution for managing and securing cloud infrastructure than they’d get from Terraform alone—or a competitive offering.
Customers are listening
This approach is working. BT Group, a leading telecommunications company, exemplifies how HashiCorp has blended its holistic product approach with key partners, in this case AWS, to meet customer demands that go beyond vanilla IaC. Working with HashiCorp and AWS, BT Group saw deployment times plummet from several days to just 10 minutes, facilitating the migration of 70 applications from their on-premises data center to AWS, resulting in improved scalability and significant cost efficiencies. BT Group didn’t just use Terraform; they also pulled in Nomad for application orchestration, Consul for service connectivity, and Vault for data security.
Other customers have used HCP Terraform to modernize their legacy applications, significantly decreasing the time required to provision new environments from weeks to hours. OXY reported a remarkable 90% improvement in developer productivity by leveraging Terraform, Packer, and Vault; Toyota successfully scaled its cloud onboarding processes using HCP Terraform and AWS Control Tower Account Factory for Terraform (AFT). Clearly there is room for up-and-comers like OpenTofu without disrupting Terraform’s continued success in the enterprise, given HashiCorp’s comprehensive approach to ILM.
What about IBM’s acquisition of HashiCorp? On paper, this strategic move could provide customers with a more comprehensive and integrated platform for managing their increasingly intricate hybrid and multicloud environments. There’s strong potential synergy between Terraform for infrastructure provisioning and IBM Red Hat’s Ansible for configuration management, further extending that ILM vision. And maybe IBM, which has a long history with open source, will make it easier for HashiCorp to revert to its original open source licensing.
Regardless, it’s unclear that the license is the primary determinant of HashiCorp’s success. Rather, being a one-stop shop for infrastructure provisioning and management will likely be the more enticing factor for enterprises that have more money than time. They just need stuff to work.
Navigating a cloudy future
Despite facing increasing competition and weathering a significant shift in its licensing model, HashiCorp’s Terraform remains the dominant force in the IaC market, supported by its robust feature set, extensive multicloud compatibility, and vibrant partner ecosystem.
It’s not that OpenTofu, Pulumi, and other open source options aren’t a threat. They are. But HashiCorp has evolved to support more complex enterprise requirements, making an attack on Terraform or any particular product less effective. HashiCorp’s strategic emphasis on the entire infrastructure life cycle, coupled with its continuous product innovation and commitment to security, positions it favorably to address the evolving needs of organizations in the cloud era.
Google introduces Gemini 2.5 reasoning models 29 Mar 2025, 1:15 am
Google has introduced version 2.5 of its Gemini AI model, which the company said offers a new level of performance by combining an enhanced base model with improved post-training.
The Gemini 2.5 models are thinking models, capable of reasoning through their thoughts before responding, thus resulting in better performance and improved accuracy, said Koray Kavukcuoglu, CTO of Google DeepMind, in a March 25 blog post. The first Gemini 2.5 release is an experimental version of Gemini 2.5 Pro, which Kavukcuoglu said is state-of-the-art across a range of benchmarks that require advanced reasoning. Available now, Gemini 2.5 Pro Experimental demonstrates strong reasoning and code capabilities, leading on common coding, math, and science benchmarks, he said.
Gemini 2.5 Pro is available now in Google AI Studio and the Gemini app for Gemini Advanced users and will be coming to Vertex AI soon. Gemini 2.5 ships with a one-million-token context window, with a two-million-token context window due soon. Moving forward, plans call for building Gemini 2.5 thinking capabilities directly into all of Google’s models, enabling the handling of more complex problems and supporting more capable, context-aware agents.
ECMAScript 2025 JavaScript standard takes shape 29 Mar 2025, 12:07 am
ECMAScript 2025, the next version of an ECMA International standard for JavaScript, will standardize new JavaScript capabilities ranging from JSON modules to import attributes, new Set methods, sync iterator helpers, and regular expression modifiers.
The ECMAScript 2025 specification will most likely be finalized in June. All told, nine finished proposals on the ECMAScript development committee’s GitHub page are expected to be published this year. Another proposal slated for 2025, for time duration formatting objects, appears on a different page. Development of ECMAScript falls under the jurisdiction of ECMA International Technical Committee 39 (TC39).
For JSON modules, the proposal calls for importing JSON files as modules. This plan builds on the import attributes proposal to add the ability to import a JSON module in a common way across JavaScript environments.
For regular expressions, meanwhile, the regular expression escaping proposal addresses situations in which developers want to build a regular expression out of a string without treating the string’s special characters as regular expression tokens. The regular expression pattern modifiers proposal provides the capability to control a subset of regular expression flags within a subexpression. Modifiers are especially helpful when regular expressions are defined in a context where executable code cannot be evaluated, such as a JSON configuration file or a TextMate language grammar file, the proposal states. Also in the regex vein, the duplicate named capturing groups proposal allows regular expression capturing group names to be repeated; previously, named capturing groups in JavaScript were required to be unique.
The sync iterator helpers proposal introduces several interfaces to help with general usage and consumption of iterators in ECMAScript. Iterators are a way to represent large or possibly infinite enumerable data sets.
Other proposals lined up for ECMAScript 2025 include:
- DurationFormat objects, an ECMAScript API specification proposal. The motivation is that users need many kinds of time duration formatting, depending on the requirements of their application.
- Specifications and a reference implementation for Promise.try, which allows optimistically synchronous but safe execution of a function, with the ability to work with the result as a Promise afterward. It mirrors the behavior of an async function.
- Float 16 on TypedArrays, DataView, and Math.f16round, which adds float16 (aka half-precision or binary16) TypedArrays to JavaScript. This plan would add a new kind of TypedArray, Float16Array, to complement the existing Float32Array and Float64Array. It also would add two new methods on DataView for reading and setting float16 values, as getFloat16 and setFloat16, to complement the existing similar methods for working with full and double precision floats. Also featured is Math.f16round, to complement the existing Math.fround. Among the benefits of this proposal is its usefulness for GPU operations.
- Import attributes, which provide syntax to import ECMAScript modules with assertions. An inline syntax for module import statements would pass on more information alongside the module specifier. The initial application for these attributes will be to support additional types of modules across JavaScript environments, beginning with JSON modules.
- Set methods for JavaScript, which add methods like union and intersection to JavaScript’s built-in Set class. Methods to be added include Set.prototype.intersection(other), Set.prototype.union(other), Set.prototype.difference(other), Set.prototype.symmetricDifference(other), Set.prototype.isSubsetOf(other), Set.prototype.isSupersetOf(other), and Set.prototype.isDisjointFrom(other). These methods require their arguments to be a Set, or at least something that looks like a Set in that it has a numeric size property as well as keys() and has() methods.
The development of the ECMAScript language specification started in November 1996, based on several originating technologies including JavaScript and Microsoft’s JScript. Last year’s ECMAScript 2024 specification included features such as resizing and transferring ArrayBuffers and SharedArrayBuffers and more advanced regular expression features for working with sets of strings.
Thread-y or not, here’s Python! 28 Mar 2025, 10:00 am
There’s more than one way to work with threads, or without them, in Python. In this edition of the Python Report: Get the skinny on Python threads and subprocesses, use Python’s native async library to break up non-CPU-bound tasks, and get started using parallel processing for heavier jobs in your Python programs. Also, check out the built-in async features in Python 3.13 if you haven’t already.
Top picks for Python readers on InfoWorld
Python threading and subprocesses explained
How does Python manage units of work meant to run side-by-side? Get started with threads and subprocesses—two different ways of getting things done in your Python programs.
How to use asyncio: Python’s built-in async library
Async is the best way to run many small jobs that yield to each other as needed—say, web scraping at scale or other network-bound operations.
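To make that concrete, here is a minimal, hypothetical sketch of the pattern, with asyncio.sleep standing in for real network I/O (the URLs and timings are illustrative only):

import asyncio

async def fetch(url: str) -> str:
    # Pretend network call; awaiting lets other tasks run in the meantime.
    await asyncio.sleep(0.1)
    return f"fetched {url}"

async def main() -> None:
    urls = [f"https://example.com/page/{n}" for n in range(5)]
    # gather() runs all the coroutines concurrently on a single thread.
    results = await asyncio.gather(*(fetch(u) for u in urls))
    for line in results:
        print(line)

if __name__ == "__main__":
    asyncio.run(main())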
The best Python libraries for parallel processing
Parallel processing libraries are used for big jobs that need to be broken across multiple CPUs or even multiple machines. Here are some of the best options out there.
Get started with the free-threaded build of Python 3.13
Do you want to really master threads in Python? Experiment with Python 3.13’s free-threaded alternate build and discover the difference for yourself.
More good reads and Python updates elsewhere
aiopandas: Async support for common Pandas operations
Check out the new monkey patch for adding async support and parallel execution to map, apply, and other common operations in Pandas. Minimal code changes required.
A very early peek at Astral’s Red Knot static type checker for Python
The team behind the uv environment management tool for Python is now brewing up a wicked-fast static type checker, too. Here’s an early peek, and some primitive but promising hints of what’s to come.
Specializing Python with e-graphs
Curious about the workings of projects that compile Python to assembly? Here’s a deep—and we mean deep—dive into analyzing and rewriting Python expressions as low-level machine language.
A look back at very early C compilers
How early? So early you’d need a PDP-11 emulator to get them running. (Also available on GitHub.)
Are we creating too many AI models? 28 Mar 2025, 10:00 am
A few days ago, I stared at yet another calendar invitation from a vendor eager to showcase their “groundbreaking” large language model (LLM). The irony wasn’t lost on me—just weeks ago, this same company had proudly showcased its environmental initiatives and impressive environmental, social, and governance scores. Today they are launching another resource-hungry AI model into an already saturated market.
As I joined the call, the familiar enthusiasm bubbled through my screen: “revolutionary capabilities,” “state-of-the-art performance,” “competitive advantage.” But all I could think about was a massive data center somewhere, humming with thousands of GPUs, consuming megawatts of power to train what was essentially another variation of existing technology.
I couldn’t help but wonder: How do they reconcile their sustainability promises with the carbon footprint of their AI ambitions? It felt like watching someone plant trees while simultaneously burning down a forest.
The world is seeing an explosion of LLMs, with hundreds now in existence. They range from proprietary giants, such as GPT-4 and PaLM, to open source alternatives, such as Llama or Falcon. Open source accessibility and corporate investments have fueled this boom, creating a crowded ecosystem where every organization wants its own version of AI magic. Few seem to realize that this growth comes at a staggering cost.
Access to these AI powerhouses has become remarkably democratized. Although some premium models such as GPT-4 restrict access, many powerful alternatives are free or at minimal cost. The open source movement has further accelerated this trend. Llama, Mistral, and numerous other models are freely available for anyone to download, modify, and deploy.
Environmental and economic impact
As I look at graphics that show the number of LLMs, I can’t help but consider the impact at a time when resources are increasingly constrained. Training alone can cost up to $5 million for flagship models, and the ongoing operational expenses reach millions per month.
Many people and organizations don’t yet realize the staggering environmental impact of AI. Training a single LLM requires enormous computational resources—the equivalent of powering several thousand homes for a year. The carbon footprint of training just one major model can equal the annual emissions of 40 cars, or approximately 200 tons of carbon dioxide, when using traditional power grids. Inference, which involves generating outputs, is less resource-intensive per request but grows quickly with use, resulting in annual costs of millions of dollars and significant energy consumption measured in gigawatt-hours.
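As a rough, back-of-the-envelope check on that comparison (assuming a commonly cited average of about 4.6 metric tons of CO2 per passenger car per year, an assumption of mine, not a figure from any vendor):

# Quick sanity check, not a rigorous estimate.
tons_per_car_per_year = 4.6      # assumed average for a typical passenger car
training_emissions_tons = 200    # figure cited above for one major model

car_equivalents = training_emissions_tons / tons_per_car_per_year
print(f"{training_emissions_tons} tons of CO2 is roughly {car_equivalents:.0f} car-years of driving")
# Prints about 43, in the same ballpark as the "40 cars" figure.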
The numbers become even more concerning when we look at the scale of current operations. Modern LLMs have hundreds of billions of parameters: GPT-3 uses 175 billion, BLOOM operates with 176 billion, and Google’s PaLM pushes this to 500 billion. Each model requires hundreds of thousands of GPU hours to train, consuming massive amounts of electricity and requiring specialized hardware infrastructure.
Computational demands directly translate into environmental impact due to energy consumption and the hardware’s carbon footprint. The location of training facilities significantly affects this impact—models trained in regions that rely on fossil fuels can produce up to 50 times more emissions than those powered by renewable energy sources.
Too much duplication
Some level of competition and parallel development is healthy for innovation, but the current situation appears increasingly wasteful. Multiple organizations are building similar capabilities, with each contributing a massive carbon footprint. This redundancy becomes particularly questionable when many models perform similarly on standard benchmarks and real-world tasks.
The differences in capabilities between LLMs are often subtle; most excel at similar tasks such as language generation, summarization, and coding. Although some models, like GPT-4 or Claude, may slightly outperform others in benchmarks, the gap is typically incremental rather than revolutionary.
Most LLMs are trained on overlapping data sets, including publicly available internet content (Wikipedia, Common Crawl, books, forums, news, etc.). This shared foundation leads to similarities in knowledge and capabilities as models absorb the same factual data, linguistic patterns, and biases. Variations arise from fine-tuning proprietary data sets or slight architectural adjustments, but the core general knowledge remains highly redundant across models.
Consequently, their outputs often reflect the same information frameworks, resulting in minimal differentiation, especially for commonly accessed knowledge. This redundancy raises the question: Do we need so many similarly trained LLMs? Moreover, the improvements from one LLM version to the next are marginal at best—all the data has already been utilized for training, and our capacity to generate new data organically won’t produce significant improvements.
Slow down, please
A more coordinated approach to LLM development could significantly reduce the environmental impact while maintaining innovation. Instead of each organization building from scratch, we could achieve similar capabilities with far less environmental and economic cost by sharing resources and building on existing open source models.
Several potential solutions exist:
- Create standardized model architectures that organizations can use as a foundation.
- Establish shared training infrastructure powered by renewable energy.
- Develop more efficient training methods that require fewer computational resources.
- Implement carbon impact assessments before developing new models.
I use LLMs every day. They are invaluable for research, including research for this specific article. My point is that there are too many of them, and too many do mostly the same thing. At what point do we figure out a better way?
Adobe announces AI agents for customer interaction 27 Mar 2025, 11:04 pm
Adobe has announced Adobe Experience Platform Agent Orchestrator, an intelligent reasoning engine that allows AI agents to perform complex decision-making and problem-solving tasks to accelerate customer experience orchestration. The Agent Orchestrator and a suite of Experience Platform Agents are “coming soon and currently in development,” Adobe said.
Announced March 18, Agent Orchestrator is rooted in semantic understanding of enterprise data, content, and customer journeys, Adobe said. This enables agentic AI solutions that are purpose-built for businesses to deliver targeted experiences with built-in data governance and regulatory compliance. Working through Adobe Experience Cloud, Adobe Experience Platform is used by companies to connect real-time data across an organization, with insights for customer experiences. With Agent Orchestrator, businesses can build and manage AI agents from Adobe and third-party ecosystems. In conjunction with Agent Orchestrator, Adobe announced 10 purpose-built AI agents, including:
- Account Qualification Agent, for evaluating opportunities to build sales pipeline and engage members of a buying group.
- Audience Agent, for analyzing cross-channel engagement data and creating high-value audience segments.
- Content Production Agent, for helping marketers scale by generating and assembling content.
- Data Insight Agent, for simplifying the process of deriving insights from signals across an organization to support visualizing, forecasting, and remediating customer experiences.
- Data Engineering Agent, for supporting high-volume data management tasks such as data integration, cleansing, and security.
- Experimentation Agent, for hypothesizing and simulating new ideas and conducting impact analysis.
- Journey Agent, for orchestrating cross-channel experiences.
- Product Advisor Agent, for supporting brand engagement and funnel advancement through product discovery and consideration experiences tailored to individual preferences and past purchases.
- Site Optimization Agent, for driving performant brand websites by detecting and fixing issues to improve customer engagement.
- Workflow Optimization Agent, for supporting cross-team collaboration by monitoring the health of ongoing projects, streamlining approvals, and accelerating workflows.
Adobe also introduced Brand Concierge, a brand-centric agent built on top of Agent Orchestrator. Users will be able to configure and manage AI agents that guide consumers from exploration to confident purchasing decisions, using immersive and conversational experiences, Adobe said.
How RamaLama helps make AI model testing safer 27 Mar 2025, 10:00 am
Consider this example: An amazing new software tool emerges suddenly, turning technology industry expectations on their heads by delivering unprecedented performance at a fraction of the existing cost. The only catch? Its backstory is a bit shrouded in mystery and it comes from a region that is, for better or worse, in the media spotlight.
If you’re reading between the lines, you of course know that I’m talking about DeepSeek, a large language model (LLM) that uses an innovative training technique to perform as well as (if not better than) similar models for a purported fraction of the typical training cost. But there are well-founded concerns around the model, both geopolitical (the startup is China-based) and technological (Was its training data legitimate? How accurate is that cost figure?).
Some might say that the various concerns around DeepSeek, many of which start on the privacy side of the coin, are overblown. Others, including organizations, states, and even countries, have banned downloads of DeepSeek’s models.
Me? I just wanted to test the model’s crazy performance claims and understand how it works—even if it had bias, even if it was kind of weird, even if it was indoctrinating me into its subversive philosophy (that’s a joke, people). I was willing to take the risk to see how DeepSeek’s advances might be used today and influence AI moving forward. With that said, I certainly didn’t want to download DeepSeek to my phone or to any other network-connected device. I didn’t want to sign up to their service, give them my credentials, or leak my prompts to a web service.
So, I decided to run the model locally using RamaLama.
Spinning up DeepSeek with RamaLama
RamaLama is an open source project that facilitates local management and serving of AI models through the use of container technology. The RamaLama project is all about reducing friction in AI workflows. By using OCI containers as the foundation for deploying LLMs, RamaLama aims to mitigate or even eliminate issues related to dependency management, environment setup, and operational inconsistencies.
Upon launch, RamaLama inspects your system for GPU support. If no GPUs are detected it falls back to CPUs. RamaLama then uses a container engine such as Podman or Docker to download an image that includes all of the software necessary to run an AI model for your system’s setup. Once the container image is in place, RamaLama pulls the specified AI model from a model registry. At this point, it launches a container, mounts the AI model as a data volume, and starts either a chatbot or a REST API endpoint (depending on what you want).
A single command!
That part still makes me super-excited. So excited, in fact, that I recently sent an email to some of my colleagues encouraging them to try it for themselves as a way to (more safely and easily) test DeepSeek.
Here, for context, is what I said:
I want to show you how easy it is to test deepseek-r1. It’s a single command. I know nothing about DeepSeek, how to set it up. I don’t want to. But I want to get my hands on it so that I can understand it better. RamaLama can help!
Just type:
ramalama run ollama://deepseek-r1:7b
When the model is finished downloading, type the same thing you typed with granite or merlin and you can compare how they perform by looking at their results. It’s interesting how DeepSeek tells itself what to include in the story before it writes the story. It’s also interesting how it confidently says things that are wrong 🙂
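RamaLama can also serve the model over HTTP instead of dropping you into a chat. A minimal sketch of querying that local endpoint from Python, assuming a serve command such as ramalama serve ollama://deepseek-r1:7b and an OpenAI-compatible chat route on the default local port (both assumptions; check your RamaLama version’s documentation):

import json
import urllib.request

# Assumes a locally served model; adjust the port and path to match your setup.
url = "http://localhost:8080/v1/chat/completions"
payload = {
    "model": "deepseek-r1:7b",
    "messages": [{"role": "user", "content": "Write a short story about an open source software company."}],
}
request = urllib.request.Request(
    url,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(request) as response:
    body = json.load(response)
print(body["choices"][0]["message"]["content"])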
What DeepSeek thinks
I included in my email the results of a query I entered into DeepSeek. I asked it to write a story about a certain open source-forward software company.
DeepSeek returned an interesting narrative, not all of which was accurate, but what was really cool was the way that DeepSeek “thought” about its own “thinking” — in an eerily human and transparent way. Before generating the story, DeepSeek—which, like OpenAI o1, is a reasoning model—spent a few moments muddling through how it would put the story together. And it showed its thinking, 760-plus words’ worth. For example, it reasoned that the story should have a beginning, a middle, and an end. It should be technical, but not too technical. It should talk about products and how they are being used by businesses. It should have a positive conclusion, and so on.
This process was like a writer and editor talking through a story. Or healthcare professionals collaborating on a patient’s care plan. Or development and security teams discussing how to work together to protect an application. I can see DeepSeek being used as a tool in these and other collaborations, but I certainly don’t want it to replace them.
Indeed, based on my trial run of DeepSeek with RamaLama, I determined that I would feel comfortable using the LLM for tasks such as generating config files or in situations where inputs and outputs are pretty well packaged up—like, “Hey, analyze this cluster and tell me if you know whether Kubernetes is healthy.” However, the glaring hallucinations in DeepSeek’s narrative about the aforementioned open source company led me to determine that DeepSeek should not be considered the supreme authority for any kind of open-ended questions whose answers have impactful ramifications.
And, honestly, I would say that today about any public LLM.
The value of RamaLama
I think that’s where the value proposition of RamaLama comes in. You can do this kind of testing and iterating on AI models without compromising your own data. When you’re done running the model locally, it can just be deleted. This is something that Ollama also does, but RamaLama’s ability to containerize models provides portability across runtimes and the ability to leverage existing infrastructure (including container registries and CI/CD workflows). RamaLama also optimizes software for specific GPU configurations and generates a Podman Quadlet file that makes it easier for developers eventually to go from experimentation to production.
These kinds of capabilities will be increasingly important as more companies invest more time, money, and trust in AI.
Indeed, DeepSeek has a plethora of potential issues but it has challenged conventional wisdom and therefore has the potential to move AI thinking and applications forward. Curiosity mixed with a healthy dose of caution should drive our work with new technology, so it will be important to continue to use and develop safe spaces such as RamaLama.
—
Generative AI Insights provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss the challenges and opportunities of generative artificial intelligence. The selection is wide-ranging, from technology deep dives to case studies to expert opinion, but also subjective, based on our judgment of which topics and treatments will best serve InfoWorld’s technically sophisticated audience. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Contact doug_dineley@foundryco.com.
What next for WASI on Azure Kubernetes Service? 27 Mar 2025, 10:00 am
Microsoft announced at the end of January 2025 that it would be closing its experimental support for WASI (WebAssembly System Interface) node pools in its managed Azure Kubernetes Service. This shouldn’t have been a surprise if you have been following the evolution of WASI on Kubernetes. The closure does require anyone using server-side WASI code on AKS to do some work as part of migrating to an alternate runtime.
It’s important to note that the two options Microsoft is suggesting don’t mean migrating away from WASI. WebAssembly and Kubernetes are two technologies that work well together. As a result, several different open source projects fill the gap, allowing you to add a new layer to your AKS platform and ensuring that you can continue running with minimal disruption.
If you’re using WASI node pools in AKS, the last day you can create a new one is May 5. You can continue using existing WASI workloads, but it’s time to look at alternatives for new developments and upgrades. You shouldn’t wait until Microsoft’s own WASI AKS service stops working; you can start planning your transition now, with official support for two alternative approaches.
From Krustlets to what?
The big issue for AKS WASI node pools was its dependency on the experimental Krustlet project, which used Rust to build WebAssembly-ready Kubelets. Unfortunately, even though Krustlet was a Cloud Native Computing Foundation Sandbox project, it’s no longer maintained, as team members have moved on to other projects. With no maintainers, the project would be left behind as both Kubernetes and WebAssembly continued to evolve.
With it no longer possible to rely on a key dependency, it’s clear that Microsoft had no choice but to change its approach to WebAssembly in AKS. Luckily for Microsoft, with AKS offering a managed way to work with Kubernetes, it still supports the wider Kubernetes ecosystem via standard APIs. That allows it to offer alternate approaches to running WASI on its platform.
Run WebAssembly functions on AKS with SpinKube
One option is to use another WASI-on-Kubernetes project’s runtime. SpinKube has been developing a shim for the standard Kubernetes container host, containerd, which lets you use runwasi to host WASI workloads without needing to change the underlying Kubernetes infrastructure. Sponsored by WebAssembly specialist Fermyon, Spin is part of a long heritage of Kubernetes tools from a team that includes the authors of Helm and Brigade.
SpinKube is a serverless computing platform that uses WASI workloads and manages them with Kubernetes. Its containerd-shim-spin tool adds runwasi support, so your nodes can host WASI code, treating it as standard Kubernetes resources. Nodes host a WASI runtime and are labeled to ensure that your workloads are scheduled appropriately, allowing you to run both WASI and standard containers at the same time, as well as tools like KEDA (Kubernetes Event-driven Autoscaling) for event-driven operations.
Other Spin tools handle deploying and managing the life cycle of shims, ensuring that you’re always running an up-to-date version and that the containerd shim is deployed as part of an application deployment. This allows you to automate working with WASI workloads, and although this requires more management than the original WASI node pools implementation, it’s a useful step away from having to do everything through the command line and kubectl.
Microsoft recommends SpinKube as a replacement for its tool and provides instructions on how to use it in AKS. You can’t use it with a Windows-based Kubernetes instance, so make sure you have a Linux-based AKS cluster to work with it. Usefully you don’t need to start from scratch as you can deploy SpinKube to existing AKS clusters. This approach ensures that you can migrate to SpinKube-based WASI node pools and keep running with Microsoft’s own tools until you’ve finished updating your infrastructure.
Deploying SpinKube on AKS
Although it’s technically available through the Azure Marketplace, most of the instructions for working with SpinKube and AKS are based on you installing it from the platform’s own repositories, using Kubernetes tools. This may be a more complex approach, but it’s more in line with how most Kubernetes operators are installed, and you’re more likely to get community support.
You will need the Azure CLI to deploy SpinKube after you have created an AKS cluster. This is where you run the Kubernetes kubectl tools, using your AKS credentials. Your cluster will need to be running cert-manager, which can be deployed using Helm. Once that’s in place, follow up by installing SpinKube’s runtime-class-manager and its associated operator. This can be found in its own repository under its original name, KWasm.
You can now deploy the containerd shim to your cluster via kubectl, using the annotate node command. This tells the runtime-class-manager to deploy the shim, labeling the nodes that are ready to use. Next, add the SpinKube custom resource definitions and runtime classes to the cluster, using kubectl to copy the spin components from GitHub and apply them to your cluster. Once these are in place, use Helm to deploy the spin-operator before adding a SpinAppExecutor.
Getting up and running is a relatively complex set of steps; however, you do have the choice of wrapping the entire deployment process in an Azure CLI script. This will allow you to automate the process and repeat it across application instances and Azure regions.
Once the SpinKube nodes are in place, you can bring your WASI applications across to the new environment. Spin is designed to load WASI code from an OCI-compliant registry, so you will need to set one up in your Azure infrastructure. You also have the option of using a CI/CD integrated registry like the one included as part of the GitHub Packages service. If you take this route, you should use a GitHub Enterprise account to keep your registry private.
With this approach, you can compile code to WASI as part of a GitHub Action, using the same Action to save it in the repository. Your AKS application will always have access to the latest build of your WASI modules. Like all Kubernetes applications, you will need to define a YAML description for your code and use the containerd shim as the executor for your code.
Use WASI microservices on AKS with wasmCloud
As an alternative to SpinKube, you can use another CNCF project, wasmCloud. Here you use a Helm chart to install the various wasmCloud components at the same time. This requires using the Azure CLI and kubectl to manage AKS, as there’s no integration with the Azure Portal. And because wasmCloud takes quite a different architectural approach, you need to start from scratch, rearchitecting your cluster and application for use with wasmCloud.
Start by creating a Kubernetes namespace for wasmCloud, before using Helm to install the wasmCloud platform components. Once the pods have restarted, use kubectl to start a wasmCloud instance and then deploy the components that make up your application. WasmCloud has its own command-line management tool, and you need to forward traffic to the management pod to use it.
Again, you must use YAML to describe your application; however, now you’re using wasmCloud’s own orchestration tools, so you will use its descriptions of your application components. Once complete, you can use the command-line tool to deploy and run the application. WasmCloud is designed to support a component model for building and running applications, with the intent of delivering a standard way of describing and calling WASI components, with support from Cosmonic.
A philosophical difference
With two alternatives for Microsoft’s WASI node pools, it’s clear there’s still a future for WebAssembly on AKS. But why two quite different ways of working with and running WASI?
The underlying philosophies of wasmCloud and SpinKube are very different; wasmCloud is designed to host full-scale WASI-based applications, assembling and orchestrating microservice components, while SpinKube is very much about quickly launching WASI-based functions, scaling from zero in very little time as an alternative to services like AWS Lambda or Azure Functions. Having support for both in AKS makes sense so you can choose the right WASI platform for your needs.
We’re still exploring what WebAssembly can do for us, so it’s good we’re not being locked into only one way to work with it in AKS. Having many different options is very much the Kubernetes way of doing things, as it’s a way to build and manage distributed applications. Like our PCs and servers, it’s a flexible and powerful platform ready for your applications, whether they’re serverless functions or large-scale enterprise services, written in whatever language, and hosted in whatever containerd-compliant environment you want.
Microsoft lauds Hyperlight Wasm for WebAssembly workloads 26 Mar 2025, 10:40 pm
Microsoft has unveiled Hyperlight Wasm, a virtual machine “micro-guest” that can run WebAssembly component workloads written in a multitude of languages including C and Python.
Introduced March 26, Hyperlight Wasm serves as a Rust library crate. Wasm modules and components can be run in a VM-backed sandbox. The purpose of Hyperlight Wasm is to enable applications to safely run untrusted or third-party Wasm code within a VM with very low latency/overhead. It is built on Hyperlight, introduced last year as an open source Rust library to execute small, embedded functions using hypervisor-based protection. Workloads in the Hyperlight Wasm guest can run for compiled languages such as C, Go, and Rust as well as for interpreted languages including Python, JavaScript, and C#. But a language runtime must be included as part of the image.
Hyperlight Wasm remains experimental and is not considered production-ready by its developers, according to the project’s GitHub page. This page also contains instructions for building with the technology. Hyperlight Wasm takes advantage of WASI (WebAssembly System Interface) and the WebAssembly Component Model. It can allow developers to implement a small set of high-level, performant abstractions in almost any execution environment and provides a fast, hardware-protected but widely compatible execution environment.
Building Hyperlight with a WebAssembly runtime enables any programming language to execute in a protected Hyperlight micro-VM without any prior knowledge of Hyperlight. Program authors just compile for the wasm32-wasip2 target, meaning programs can use runtimes such as Wasmtime or Jco, Microsoft said. Programs also can run on servers such as Nginx Unit, Spin, wasmCloud, or, now, Hyperlight Wasm. In an ideal scenario, developers wouldn’t need to think about what runtime their code will run on as they are developing it. Also, by combining Hyperlight with WebAssembly, Microsoft said it was achieving more security and performance than traditional VMs by doing less work overall. Wasmtime provides strong isolation boundaries for Wasm workloads via a software-defined sandbox, Microsoft said.
Plans call for enabling Hyperlight Wasm to work on Arm64 processors. Thus far, planning has centered on using WASI on Hyperlight for portability between operating systems and VMs. But Wasm applications are portable between different instruction sets. Also, Hyperlight Wasm will soon be extended with default bindings for some WASI interfaces.
Critical RCE flaws put Kubernetes clusters at risk of takeover 26 Mar 2025, 3:53 pm
The Kubernetes project has released patches for five vulnerabilities in a widely used component called the Ingress NGINX Controller, which is used to route external traffic to Kubernetes services. If exploited, the flaws could allow attackers to completely take over entire clusters.
“Based on our analysis, about 43% of cloud environments are vulnerable to these vulnerabilities, with our research uncovering over 6,500 clusters, including Fortune 500 companies, that publicly expose vulnerable Kubernetes ingress controllers’ admission controllers to the public internet — putting them at immediate critical risk,” wrote researchers from cloud security firm Wiz who found and reported the flaws.
Databricks’ TAO method to allow LLM training with unlabeled data 26 Mar 2025, 1:21 pm
Data lakehouse provider Databricks has unveiled a new large language model (LLM) training method, TAO, that will allow enterprises to train models without labeling data.
Typically, when LLMs are adapted to new enterprise tasks, they are trained by using prompts or by fine-tuning the model with data sets for the specific task.
However, both of these techniques have caveats. While prompting is seen as an error-prone process with limited quality gains, fine-tuning requires large amounts of human-labeled data, which most enterprises either don’t have or would find extremely time-consuming to produce.
TAO, or Test-time Adaptive Optimization, provides an alternative to fine-tuning a model, according to Databricks, by leveraging test-time compute and reinforcement learning (RL) to teach a model to do a task better based on past input examples alone. That means it scales with an adjustable tuning compute budget, not human labeling effort.
Test-time compute, which has gained popularity due to its use by OpenAI and DeepSeek across their o1 and R1 models, is the compute resources that any LLM uses during the inference phase, which is when it is being asked to complete a task and not during training.
These compute resources, which focus on how the model is actually reasoning to solve a task or query, can be used to make adjustments to improve output quality, according to a community post on Hugging Face.
However, Databricks’ Mosaic Research team has pointed out that enterprises don’t need to be alarmed about the rise in inference costs if they were to adopt TAO.
“Although TAO uses test-time compute, it uses it as part of the process to train a model; that model then executes the task directly with low inference costs (i.e., not requiring additional compute at inference time),” the team wrote in a blog post.
Mixed initial response to TAO
Databricks co-founder and CEO Ali Ghodsi’s LinkedIn post about TAO has attracted a mixed initial response.
While some users, such as Iman Makaremi, co-founding head of AI at Canadian startup Catio, and Naveed Ahamed, senior enterprise architect at Allianz Technology, were excited to implement and experiment with TAO, other users posed questions about its efficiency.
Tom Puskarich, a former senior account manager at Databricks, questioned the use of TAO when training a model for new tasks.
“If you are upgrading a current enterprise capability with a trove of past queries, but for enterprises looking to create net new capabilities, wouldn’t a training set of labeled data be important to improve quality?” Puskarich wrote.
“I love the idea of using inputs to improve but most production deployments don’t want a ton of bad experiences at the front end while the system has to learn,” Puskarich added.
Another user, Patrick Stroh, head of Data Science and AI at ZAP Solutions, pointed out that enterprise costs may increase.
“Very interesting, but also cognizant of the (likely increase) costs due to an adaptation phase. (This would likely be incremental to the standard costs (although still less than fine-tuning)). (I simply can’t understand how it would the SAME as the original LLM as noted given that adaptation compute. But I suppose they can price it that way.),” Stroh wrote.
How does TAO work?
TAO comprises four stages: response generation, response scoring, reinforcement learning, and continuous improvement.
In the response generation stage, enterprises can begin with collecting example input prompts or queries for a task, which can be automatically collected from any AI application using its proprietary AI Gateway.
Each prompt is then used to generate a diverse set of candidate responses, which are systematically evaluated for quality in the response scoring stage, the company explained. Scoring methodologies include a variety of strategies, such as reward modeling, preference-based scoring, or task-specific verification using LLM judges or custom rules.
In the reinforcement learning stage, the model is updated or tuned so that it produces outputs more closely aligned with high-scoring responses identified in the previous step.
“Through this adaptive learning process, the model refines its predictions to enhance quality,” the company explained.
And finally, in the continuous improvement phase, enterprise users create new data, essentially fresh LLM inputs, simply by interacting with the model; that data can then be used to further optimize model performance.
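Databricks hasn’t published TAO’s implementation, but the four stages sketch out a familiar generate-score-update loop. The toy Python below is purely illustrative of that loop, with random stand-ins for the candidate generator and scorer; it is not Databricks’ code or API:

import random

def generate_candidates(prompt, n=4):
    # Stage 1: produce several candidate responses per prompt
    # (in TAO, the model itself generates these).
    return [f"candidate {i} for: {prompt}" for i in range(n)]

def score(prompt, response):
    # Stage 2: score each candidate. Databricks describes reward models,
    # preference scoring, LLM judges, or custom rules; a random number
    # stands in here.
    return random.random()

def collect_tuning_pairs(prompts):
    pairs = []
    for prompt in prompts:
        candidates = generate_candidates(prompt)
        best = max(candidates, key=lambda c: score(prompt, c))
        pairs.append((prompt, best))
    return pairs

# Stage 3 would use these high-scoring pairs to update the model with a
# reinforcement-learning-style step; stage 4 repeats the loop on fresh
# prompts collected from the deployed application.
print(collect_tuning_pairs(["Summarize this contract.", "Classify this ticket."]))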
TAO can increase the efficiency of inexpensive models
Databricks said it used TAO to not only achieve better model quality than fine-tuning but also upgrade the functionality of inexpensive open-source models, such as Llama, to meet the quality of more expensive proprietary models like GPT-4o and o3-mini.
“Using no labels, TAO improves the performance of Llama 3.3 70B by 2.4% on a broad enterprise benchmark,” the team wrote.
TAO is now available in preview to Databricks customers who want to tune Llama, the company said. The company is planning to add TAO to other products in the future.