SimplePie: Demo

Technology insight for the enterprise

AWS updates Amazon Bedrock’s Data Automation capability 28 Apr 2025, 11:41 am

AWS has updated the Data Automation capability inside its generative AI service Amazon Bedrock to further support the automation of generating insights from unstructured data and bring down the development time required for building applications underpinned by large language models (LLMs).

Bedrock’s Data Automation, according to AWS, is targeted at developers who can use the capability to accelerate the development of generative AI-driven applications by helping build components or workflows, such as automated data analysis for insights, in a simplified manner.

AWS has integrated Data Automation with Bedrock’s Knowledge Bases capability to help developers extract information from unstructured multimodal data and use it as context for retrieval augmented generation (RAG) use cases.

The latest update to Data Automation includes support for modality enablement, modality routing by file type, extraction of embedded hyperlinks when processing documents, and an increased overall document page limit of 3,000 pages.

“These new features give you more control over how your multimodal content is processed and improve Bedrock Data Automation’s overall document extraction capabilities,” AWS wrote in a blog post.

AWS said that enterprise developers can use the modality enablement feature to configure which modalities — image, document, audio, and video — will be processed amongst all data for a particular project or application.

Developers can also route specific file types to a particular modality. For example, JPEG or JPG files can be processed as documents, and MP4 or M4V files as video, instead of under their original image or audio type.
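The enablement and routing behavior described above can be pictured with a small sketch. This is not the Bedrock Data Automation API: the `route` function, the `DEFAULT_MODALITY` table, and the override format are all hypothetical names and assumed defaults, used only to illustrate how a per-project setting might decide which modality a file is processed under.

```python
from pathlib import PurePath
from typing import Optional

# Assumed defaults for illustration only; the real service defines its own.
DEFAULT_MODALITY = {
    ".jpg": "image",
    ".jpeg": "image",
    ".pdf": "document",
    ".mp4": "video",
    ".m4v": "video",
}

def route(filename: str, enabled: set[str], overrides: dict[str, str]) -> Optional[str]:
    """Return the modality a file would be processed as, or None if skipped."""
    extension = PurePath(filename).suffix.lower()
    # A per-extension override wins over the default mapping.
    modality = overrides.get(extension, DEFAULT_MODALITY.get(extension))
    if modality is None or modality not in enabled:
        return None  # unknown file type, or modality disabled for this project
    return modality

# Hypothetical project: only documents and video are enabled, JPGs routed as documents.
routed = route("scan.JPG", {"document", "video"}, {".jpg": "document"})
skipped = route("photo.jpeg", {"document", "video"}, {})
```

In this sketch a scanned JPG is treated as a document because of the override, while an unrouted JPEG is skipped because the image modality is not enabled.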

Another feature that has been added to Data Automation is the embedding of hyperlinks found in PDFs as part of the output or insights generated.

“This feature enhances the information extraction capabilities from documents, preserving valuable link references for applications such as knowledge bases, research tools, and content indexing systems,” the cloud services provider wrote.

AWS has also doubled the document size limit in Bedrock Data Automation, from 1,500 to 3,000 pages per document.

The increased limit will enable developers to process larger documents without splitting them, the cloud services provider said, adding that this also simplifies workflows for enterprises dealing with long documents or document packets.

Currently, Amazon Bedrock Data Automation is generally available in the US West (Oregon) and US East (Northern Virginia) regions.

14 tiny tricks for big cloud savings 28 Apr 2025, 11:00 am

When the cost of cloud computing is listed in cents or even fractions of a cent, it’s hard to remember that even small numbers can add up to big bills. Yet they do, and every month it seems CFOs come close to dying of multiple heart attacks when the cloud computing bill arrives.

To save the health of these bean counters, and also the necks of the engineers on the receiving end of their ire, here’s a list of small ways organizations can save money on the cloud. None of these tricks is likely to lead to big savings on its own, but together they can add up—or should we say subtract down?—to lower the overall cloud bill.

Shut down development clusters when they’re not in use

Some developers work eight hours a day, and some work more. But it’s rare for anyone to use a development cluster for more than 12 hours a day for a sustained period. There are 168 hours in a week but if you and your team work only a quarter of those hours, it’s possible to save 75% on the cost of running your development clusters. Yes, shutting down big clusters can take time. Yes, some types of odd machines may be hard to spin up immediately. Consider writing scripts that run in the background and manage it all for you.
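The arithmetic behind that 75% figure, plus the kind of check a background script might make, can be sketched as follows. The working window (weekdays, 8:00 to 20:00) and the function names are assumptions for illustration; the actual start and stop calls depend on your cloud provider and are left out.

```python
from datetime import datetime

HOURS_PER_WEEK = 7 * 24  # 168

def savings_fraction(active_hours_per_week: float) -> float:
    """Fraction of the always-on cost avoided by running only when in use."""
    if not 0 <= active_hours_per_week <= HOURS_PER_WEEK:
        raise ValueError("active hours must be between 0 and 168")
    return 1 - active_hours_per_week / HOURS_PER_WEEK

def cluster_should_run(now: datetime, start_hour: int = 8, end_hour: int = 20) -> bool:
    """True on weekdays from start_hour (inclusive) to end_hour (exclusive)."""
    is_weekday = now.weekday() < 5  # Monday is 0, Friday is 4
    return is_weekday and start_hour <= now.hour < end_hour

# A quarter of the week is 42 hours, which leaves 75% of the cost avoidable.
print(savings_fraction(42))  # 0.75
```

A scheduled job could call `cluster_should_run(datetime.now())` every few minutes and start or stop the cluster when the answer changes.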

Smart mock your microservices

Many cloud applications are constellations of machines running microservices. Instead of firing up all your machines, you can employ a smart set of mock services to imitate machines that are not the focus of the daily work. Mock instances of microservices can significantly shrink what is required to test new code. Smart developers can often configure these mock instances to offer better telemetry for debugging by tracking all the data that comes their way.
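As a minimal sketch of the idea, an in-process mock can return canned responses and record everything sent to it. The `MockService` class, the `inventory` service, and its `/stock` endpoint are hypothetical; real projects often use a dedicated mocking library or a lightweight HTTP stub instead.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class MockService:
    """In-process stand-in for a microservice that is not under test."""
    name: str
    canned_responses: dict[str, Any]
    calls: list[tuple[str, Any]] = field(default_factory=list)

    def handle(self, endpoint: str, payload: Any = None) -> Any:
        # Record every request so it can be inspected while debugging.
        self.calls.append((endpoint, payload))
        if endpoint not in self.canned_responses:
            raise KeyError(f"{self.name}: no canned response for {endpoint!r}")
        return self.canned_responses[endpoint]

inventory = MockService("inventory", {"/stock": {"sku-1": 3}})
reply = inventory.handle("/stock", {"sku": "sku-1"})
```

After a test run, `inventory.calls` holds the full request history, which is the extra telemetry the paragraph above refers to.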

Cap local disk storage

Many cloud instances come with standard disks or persistent storage with default sizes. This can be some of the most expensive disk space for your computation, so it makes sense to limit how much you assign to your machines. Instead of choosing the default, try to get by with as little as possible. This may mean clearing caches or deleting local copies of data after it’s safely stored in a database or object storage. In other words, try to build very lightweight versions of your servers that don’t need much local storage.

Right-size cloud instances

Good algorithms can boost the size of your machine when demand peaks. But clouds don’t always make it easy to shrink all the resources on disk. If your disks grow, they can be hard to shrink. By monitoring these machines closely, you can ensure that your cloud instances consume only as much as they need and no more.

Choose cold storage

Some cloud providers include services for storing data that does not need fast access. AWS’s Glacier and Scaleway, for instance, charge a very low price but only if you accept a latency of several hours or more. It makes sense to carefully migrate cold data to these cheaper locations. In some cases, security could be another argument for choosing this option. Scaleway boasts of using a former nuclear fallout shelter to physically protect data.

Choose cheaper providers

Some competitors offer dramatically lower prices for services like object storage. Wasabi claims to offer prices that are 80% less than the competition. Backblaze says its services cost one-fifth of what you might pay elsewhere. Those are big savings. These services also compete on access latency, offering faster “hot storage” response times. Of course, you’ll still have to wait for your queries to travel over the general Internet instead of just inside one data center, but the difference can still be significant. Affordable providers also sometimes offer competitive terms for data access. Some cut out the egress fees, which can be valuable for data that is downloaded frequently.

Spot machines

Some cloud providers run auctions on spare machines and the price tags can be temptingly low. Because you can run tasks without firm deadlines when the spot price is low, spare machines are great for background work like generating monthly reports. On the other hand, it’s important to know these spot instances may be shut down without much warning. Applications that run on spare machines should be idempotent. It’s also worth noting that when demand is high, the spot prices can soar. Just think of using them as a bit of a financial adventure.
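One simple way to make a background job safe to restart after a spot instance disappears is a completion marker. The sketch below is an assumption-laden illustration: `run_once`, the job ID, and the marker file layout are hypothetical, and a real deployment would keep the markers in durable shared storage rather than on the instance's own disk.

```python
import tempfile
from pathlib import Path

def run_once(job_id: str, work, state_dir: Path) -> bool:
    """Run work() unless a completion marker for job_id already exists.

    Returns True if the work ran, False if it was skipped.
    """
    marker = state_dir / f"{job_id}.done"
    if marker.exists():
        return False  # finished earlier, so a rerun changes nothing
    work()
    # Write the marker only after the work succeeds, so an interrupted
    # run is retried on the next attempt.
    marker.write_text("ok")
    return True

runs = []
with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp)
    first = run_once("report-2025-04", lambda: runs.append("built"), state)
    second = run_once("report-2025-04", lambda: runs.append("built"), state)
```

The second call is skipped, so the report is built exactly once. The work itself still has to tolerate being restarted from the beginning if the machine is reclaimed partway through.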

Reserved instances

Cloud providers can offer significant discounts for organizations that make a long-term commitment to using hardware. These are sometimes called reserved instances, or usage-based discounts. They can be ideal when you know just how much you’ll need for the next few years. The downside is that the commitment locks in both sides of the deal. You can’t just shut down machines in slack times or when a project is canceled.
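A rough break-even check makes the trade-off concrete. Under the simplifying assumption that a reservation bills every hour of the term at a discounted rate while on-demand bills only the hours used, the reservation wins when expected utilization exceeds one minus the discount. The 40% discount below is a made-up figure, not any provider's published rate.

```python
def reserved_is_cheaper(discount: float, expected_utilization: float) -> bool:
    """Compare paying the discounted rate for every hour of the term with
    paying the on-demand rate only for the hours actually used."""
    reserved_cost = 1 - discount           # per hour, relative to on-demand
    on_demand_cost = expected_utilization  # fraction of hours actually used
    return reserved_cost < on_demand_cost

# Hypothetical 40% discount: worthwhile at 80% utilization, not at 50%.
busy = reserved_is_cheaper(0.40, 0.80)
slack = reserved_is_cheaper(0.40, 0.50)
```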

Be transparent

Engineers are pretty good at solving problems, especially numerical ones, and in the end, cloud cost is just another metric to optimize. Many teams leave the cloud costs to some devops pro who might have a monthly meeting with someone from finance. A better solution is to broadcast the spending data to everyone on the team. Let them drill down into the numbers and see just where the cash is going. A good dashboard that breaks down cloud costs may just spark an idea about where to cut back.

Go serverless

The cloud computing revolution has always been about centralizing resources and then making it easy for users to buy only as much as they need. The logical extreme of this is billing by each transaction. The poorly named “serverless” architecture is a good example of saving money by buying only what you need. A friend of mine brags that one of his side hustles costs him only 3 cents per month but one day he hopes it will go viral and the bills will spiral into the tens or even hundreds of dollars. Businesses with skunk work projects or proofs of concept love these options because they can keep computing costs quite low until the demand arrives.

Store less data

Programmers like to keep data around in case they might ever need it again. That’s a good habit until your app starts scaling and it’s repeated a bazillion times. If you don’t call the user, do you really need to store their telephone number? Tossing personal data aside not only saves storage fees but limits the danger of releasing personally identifiable information. Stop keeping extra log files or backups of data that you’ll never use again.
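A retention rule for log files can be as small as the sketch below, which deletes files older than a cutoff. The 30-day window, the directory layout, and the `prune_old_files` name are assumptions for illustration; check your compliance obligations before deleting anything for real.

```python
import os
import tempfile
import time
from pathlib import Path

SECONDS_PER_DAY = 86400

def prune_old_files(directory: Path, max_age_days: float) -> list[str]:
    """Delete files last modified more than max_age_days ago; return their names."""
    cutoff = time.time() - max_age_days * SECONDS_PER_DAY
    removed = []
    for path in sorted(directory.iterdir()):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return removed

with tempfile.TemporaryDirectory() as tmp:
    logs = Path(tmp)
    (logs / "old.log").write_text("stale")
    (logs / "new.log").write_text("fresh")
    # Backdate one file by 60 days to simulate an old log.
    sixty_days_ago = time.time() - 60 * SECONDS_PER_DAY
    os.utime(logs / "old.log", (sixty_days_ago, sixty_days_ago))
    removed = prune_old_files(logs, max_age_days=30)
    remaining = sorted(p.name for p in logs.iterdir())
```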

Store data locally

Many modern browsers make it possible to store data locally, either in a simple key-value store or in a basic version of a classic database. The Web Storage API offers the key-value store, while IndexedDB stores structured objects and indexes them too. Both solutions were intended to be smart, local caches for building more sophisticated web applications that also responded quickly without overloading the network connection. But they can also be used to save storage costs. If the user wants to save endless drafts, well, maybe they can pay for it themselves.

Move the work elsewhere

While many cloud providers charge the same no matter where you store your data, some are starting to change the price tag based on location. AWS, for instance, charges $0.023 per gigabyte per month in Northern Virginia but $0.026 in Northern California for S3 storage. Alibaba recently cut its prices in offshore data centers much more than the onshore ones. Location matters quite a bit in these examples. Unfortunately, it may not be easy to take advantage of these cost savings for large blocks of data. Some cloud providers charge egress fees for moving data between regions. Still, it’s a good idea to shop around when setting up new programs.
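To see what a fraction of a cent means at scale, the two S3 prices quoted above can be applied to a hypothetical data set. The 100 TB size is an assumption, and the sketch ignores volume tiers, request fees, and any charge for moving the data between regions.

```python
# Per-gigabyte monthly prices quoted in the text above.
PRICE_PER_GB_MONTH = {
    "Northern Virginia": 0.023,
    "Northern California": 0.026,
}

def monthly_storage_cost(gigabytes: float, region: str) -> float:
    """Flat-rate monthly storage cost in dollars for the given region."""
    return gigabytes * PRICE_PER_GB_MONTH[region]

size_gb = 100_000  # hypothetical 100 TB data set, counting 1 TB as 1,000 GB
difference = (
    monthly_storage_cost(size_gb, "Northern California")
    - monthly_storage_cost(size_gb, "Northern Virginia")
)
print(round(difference, 2))  # 300.0
```

At this size the regional gap comes to roughly $300 a month, or about $3,600 a year.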

Offload cold data

Cutting back on some services will save money, but the best way to save cash is to go cold turkey. There’s nothing stopping you from dumping your data into a hard disk on your desk or down the hall in a local data center. Hard disk prices can be just above $10 per terabyte for a new hard disk or below $7 for a used disk. And that’s not a monthly price or even an annual one; it’s for as long as the disk keeps spinning. Of course, you only get that price in return for taking on all the responsibility and the cost of electricity. It won’t make sense for your serious workloads, but the savings for not-so-important tasks like very cold backup data can be significant. You might also note some advantages in cases where compliance rules favor having physical control of the data.
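A back-of-the-envelope comparison shows how quickly a one-time disk purchase is paid off. The cloud rate used here is the $0.023 per gigabyte S3 price quoted earlier, which is an assumption on my part since cold-storage tiers cost less, and the sketch leaves out electricity, redundancy, and labor.

```python
def breakeven_months(disk_price_per_tb: float, cloud_price_per_gb_month: float) -> float:
    """Months of cloud storage fees that add up to the one-time disk price."""
    cloud_price_per_tb_month = cloud_price_per_gb_month * 1000  # 1 TB = 1,000 GB
    return disk_price_per_tb / cloud_price_per_tb_month

new_disk = breakeven_months(10.0, 0.023)   # new disk at about $10 per terabyte
used_disk = breakeven_months(7.0, 0.023)   # used disk at about $7 per terabyte
```

Under these assumptions both values come out below one month. A cheaper cold-storage rate would stretch the break-even point, so it is worth rerunning the numbers with your own prices.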

Conquering the costs and complexity of cloud, Kubernetes, and AI 28 Apr 2025, 11:00 am

Platform engineering teams are at the forefront of enterprise innovation, leading initiatives in cloud computing, Kubernetes, and AI to drive efficiency for developers and data scientists. However, these teams face mounting challenges in managing costs and complexity across their expanding technological landscape. According to industry research conducted by my company, Rafay Systems, 93% of teams face hurdles in Kubernetes management, with cost visibility and complex cloud infrastructure cited as top challenges for organizations.

While IT leaders clearly see the value in platform teams—nine in 10 organizations have a defined platform engineering team—there’s a clear disconnect between recognizing their importance and enabling their success. This gap signals major stumbling blocks ahead that risk derailing platform team initiatives if not addressed early and strategically. For example, platform teams find themselves burdened by constant manual monitoring, limited visibility into expenses, and a lack of standardization across environments. These challenges are only amplified by the introduction of new and complex AI projects. There’s a pressing need for solutions that balance innovation with cost control so that platform teams can optimize resources efficiently without stunting modernization.

The problem with platform team tools

Let’s zoom out a bit. The root cause of platform teams’ struggles with Kubernetes cost visibility and control often traces back to their reliance on tools that are fundamentally misaligned with modern infrastructure requirements.

Legacy cost monitoring tools often fall short due to a multitude of reasons:

They lack the granular visibility needed for cost allocation across complex containerized environments.
They weren’t designed for today’s multi-team, multi-cloud architectures, creating blind spots in resource tracking.
Their limited visibility often results in budget overruns and inefficient resource allocation.
They provide inadequate cost forecasting and budgeting.

Our research shows that almost a third of organizations underestimate their total cost of ownership for Kubernetes, and that a lack of proper visibility into costs is a major hurdle for organizations. Nearly half (44%) of organizations reported that “providing cost visibility” is a key organizational focus for addressing Kubernetes challenges in the next year. And while standardization is a foundational element for both cost control and operational efficiency, close to 40% of organizations report challenges in establishing and maintaining it enterprise-wide.

Platform teams that manually juggle cost monitoring across cloud, Kubernetes, and AI initiatives find themselves stretched thin and trapped in a tactical loop of managing complex multi-cluster Kubernetes environments. This prevents them from driving strategic initiatives that could actually transform their organizations’ capabilities.

These challenges reflect the overall complexity of modern cloud, Kubernetes, and AI environments. While platform teams are chartered with providing infrastructure and tools necessary to empower efficient development, many resort to short-term patchwork solutions without a cohesive strategy. This creates a cascade of unintended consequences: slowed adoption, reduced productivity, and complicated AI integration efforts.

The AI complexity multiplier

The integration of AI and generative AI workloads adds another layer of complexity to an already challenging landscape, as managing computational costs and the resources it takes to train models introduces new hurdles. Nearly all organizations (95%) plan to increase Kubernetes usage in the next year, while simultaneously doubling down on AI and genAI capabilities. 96% of organizations say it’s important for them to provide efficient methods for the development and deployment of AI apps and 94% say the same for generative AI apps. This threatens to overwhelm platform teams even more if they don’t have the right tools and strategies in place.

As a result, organizations increasingly seek capabilities for GPU virtualization and sharing across AI workloads to improve utilization and reduce costs. The ability to automatically allocate AI workloads to appropriate GPU resources based on cost and performance considerations has become essential for managing these advanced technologies effectively.
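Cost-aware placement of this kind can be sketched in a few lines. The `GpuPool` type, the pool names, and the prices are invented for illustration, and the rule shown (cheapest pool with a free GPU and enough memory) is just one plausible policy, not a description of any particular scheduler.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class GpuPool:
    name: str
    memory_gb: int      # memory per GPU
    hourly_cost: float  # price per GPU-hour
    free_gpus: int

def cheapest_fit(required_memory_gb: int, pools: list[GpuPool]) -> Optional[GpuPool]:
    """Return the lowest-cost pool with a free GPU that has enough memory."""
    eligible = [
        p for p in pools
        if p.free_gpus > 0 and p.memory_gb >= required_memory_gb
    ]
    return min(eligible, key=lambda p: p.hourly_cost, default=None)

pools = [
    GpuPool("small", memory_gb=16, hourly_cost=0.5, free_gpus=4),
    GpuPool("medium", memory_gb=40, hourly_cost=1.5, free_gpus=0),
    GpuPool("large", memory_gb=80, hourly_cost=3.0, free_gpus=2),
]
choice = cheapest_fit(24, pools)
```

A 24 GB workload lands on the large pool here because the small pool lacks memory and the medium pool has no free GPUs.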

Prioritizing automation and self-service

Our research reveals a clear mandate: Organizations must fundamentally transform how they approach infrastructure management and become enablers of self-service capabilities. Organizations are prioritizing proactive, automation-driven solutions such as automated cluster provisioning, standardized and automated infrastructure, and self-service experiences as top initiatives for developers.

Organizations are zeroing in on a range of cost management initiatives for platform teams over the next year, including:

Reducing and optimizing costs associated with Kubernetes infrastructure,
Visibility and showback into cloud and Kubernetes costs,
Providing chargeback to internal groups (finops).
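As a rough illustration of showback and chargeback, a shared cluster bill can be split in proportion to each team's CPU requests. The team names and figures are made up, and real finops tooling usually also weighs memory, GPUs, storage, and idle capacity.

```python
def chargeback(total_cost: float, cpu_requests: dict[str, float]) -> dict[str, float]:
    """Split a shared cluster bill in proportion to each team's CPU requests."""
    total_cpu = sum(cpu_requests.values())
    if total_cpu <= 0:
        raise ValueError("no CPU requests to allocate against")
    return {team: total_cost * cpu / total_cpu for team, cpu in cpu_requests.items()}

# Hypothetical month: a $1,000 cluster bill and three teams' requested cores.
bill = chargeback(1000.0, {"payments": 6.0, "search": 3.0, "ml": 1.0})
```

Reporting these numbers to teams without billing them is showback. Actually invoicing internal groups for them is chargeback.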

The push toward automation and self-service represents more than just a technical evolution—it’s a fundamental shift in how organizations approach infrastructure management. Self-service automation allows developers to move quickly while maintaining guardrails for resource usage and cost control. At the same time, standardized infrastructure and automated provisioning help ensure consistent deployment practices across increasingly complex environments. The result is a more sustainable approach to platform engineering that can scale with organizational needs while keeping costs in check.

By investing in automation and self-service capabilities now, organizations can position their platform teams to handle future challenges more effectively, whether they come from new technologies, changing business needs, or evolving infrastructure requirements.

Empowering platform teams

Platform team challenges—from Kubernetes and multi-cloud management to generative AI implementation—are significant, but not insurmountable. Organizations that successfully navigate this landscape understand that empowering platform teams requires more than just acknowledging their importance. It highlights the need for robust, versatile tools and processes that enable effective cost management and standardization. Platform teams need comprehensive solutions that balance innovation with cost control, while optimizing resources efficiently without impeding modernization efforts. Empowered platform teams will be the key differentiator between organizations that survive and those that excel as the landscape continues to evolve with new challenges in cloud, Kubernetes, and AI.

Haseeb Budhani is co-founder and CEO of Rafay Systems.

—

New Tech Forum provides a venue for technology leaders—including vendors and other outside contributors—to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all inquiries to doug_dineley@foundryco.com.

OpenSearch in 2025: Much more than an Elasticsearch fork 28 Apr 2025, 11:00 am

Open source has never been more popular. It’s also never been more contentious.

With hundreds of billions of dollars available to the companies that best turn open source into easy-to-use cloud services, vendors have fiercely competed for enterprise dollars. This has led to a spate of licensing changes from the companies that develop the open source software, and forks from the clouds that want to package and sell it. But something interesting is happening: These forks may start as clone wars, but they’re increasingly innovative projects in their own right. I’ve recently written about OpenTofu hitting its stride against Terraform, but OpenSearch, which has its big community event this week in Amsterdam, is an even bigger success story.

Born from the fire of Elastic’s 2021 license change, OpenSearch spent its first few years stabilizing and proving it could (and should) continue to exist. In the past year, OpenSearch has actively forged its own identity as a truly independent and innovative force in enterprise search, one that is quickly evolving to be much more than an Elasticsearch look-alike.

Moving beyond the fork

To understand OpenSearch’s recent path, a quick rewind is essential. In early 2021, Elastic ditched the Apache License (ALv2) for new Elasticsearch and Kibana versions, opting for the Server Side Public License (SSPL) and the Elastic License (ELv2). The goal? Keep AWS and other cloud vendors from offering Elasticsearch as a service without Elastic getting a cut. AWS, whose managed service relied heavily on the ALv2 codebase, responded swiftly, forking Elasticsearch 7.10.2 and Kibana 7.10.2. They stripped Elastic’s proprietary code and telemetry, launching the OpenSearch project under ALv2. It was a bold move but it left a lot of uncertainty: AWS didn’t have expertise in running a community-driven project, and only had a bit more experience managing its own open source projects (such as Firecracker).

Frankly, the odds weren’t great that AWS would succeed. And yet it has.

In 2023 I noted some of OpenSearch’s early successes as it expanded its community and won over some early customers. But it’s really the events in the past year that have demonstrated just how far AWS has come in learning how to contribute to open source in big ways, setting up OpenSearch as a serious contender in enterprise search.

Even though most open source projects have very limited contributor pools and often are the handiwork of a single developer (or a single company), it’s easier to attract volunteer contributors when a project sits within a neutral foundation. As such, AWS demonstrated how serious it was about OpenSearch’s open source success when it moved the project to the Linux Foundation in late 2024, establishing the OpenSearch Software Foundation (OSSF). This wasn’t just admin shuffling; it was strategic. Placing the project within a neutral foundation directly addressed concerns about AWS controlling the project. Suddenly the Technical Steering Committee (TSC) boasted representatives from SAP, Uber, Oracle, ByteDance, and others. Additionally, OpenSearch now can claim more than 1,400 unique contributors (over 350 active), hundreds of maintainers across dozens of organizations, and activity spanning more than 100 GitHub repositories by early 2025. Critically, the percentage of contributions and maintainers from outside AWS has significantly increased, signaling progress towards genuine diversification.

For AWS, whose Leadership Principles almost demand control over customer outcomes (“Deliver Results,” etc.), this is a revolutionary change in how it does business.

Getting better all the time

Clearly, OpenSearch is on the correct path. With governance solidifying, OpenSearch has pursued aggressive development, guided by a public road map, pushing beyond its roots to tackle modern data challenges, especially in AI/vector search and observability. OpenSearch has significantly moved beyond mere Elasticsearch compatibility. Driven by user needs, OpenSearch has added vector similarity search, hybrid search combining keyword and semantic methods, and built-in neural search capabilities. In 2024 alone, OpenSearch made major strides—adding integration with Facebook’s FAISS, SIMD hardware acceleration, and vector quantization for high-performance semantic searches.
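For readers unfamiliar with the terms, the toy sketch below shows the general idea behind vector similarity and hybrid ranking. It is not OpenSearch's implementation or API: the two-dimensional vectors, the keyword scores, and the equal weighting are all assumptions chosen to keep the arithmetic easy to follow.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def hybrid_score(keyword_score: float, semantic_score: float, keyword_weight: float = 0.5) -> float:
    """Weighted blend of a normalized keyword score and a semantic score."""
    return keyword_weight * keyword_score + (1 - keyword_weight) * semantic_score

query_vector = [1.0, 0.0]
# Each document: (embedding, normalized keyword score). All values are made up.
documents = {
    "a": ([1.0, 0.0], 0.2),
    "b": ([0.0, 1.0], 0.9),
}
ranked = sorted(
    documents,
    key=lambda doc_id: hybrid_score(
        documents[doc_id][1],
        cosine_similarity(query_vector, documents[doc_id][0]),
    ),
    reverse=True,
)
```

Document "a" ranks first despite a weaker keyword match because its embedding points the same way as the query.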

Performance and scalability improvements have also been dramatic. Query speeds increased significantly (up to six times faster than early versions), thanks to extensive optimizations. New features, such as segment replication, have boosted data ingestion throughput by approximately 25%. Additionally, remote-backed storage now enables cost-efficient indexing directly into cloud object storage services, critical for enterprises dealing with petabyte-scale data sets.

This isn’t a community hoping to play catch-up. This is a strategic bid for leadership.

It’s one thing to write good code. It’s quite another to convince enterprises to use it. In this area, there’s growing evidence that OpenSearch is gaining enterprise ground. Just measuring use (without concern for whether it’s paid adoption), by the end of 2023 OpenSearch had surpassed 300 million cumulative downloads, clearly signaling mainstream adoption. AWS, for its part, touts “tens of thousands” of customers (which may be true, but that number includes users of older Elasticsearch versions). Although it’s hard to find public examples of large enterprises adopting OpenSearch, past and future OpenSearchCon events reveal LINE, Coursera, and other significant users (though most of the talks are still given by AWS employees). Job postings show that Fidelity Investments, Warner Bros, and others are OpenSearch users. Plus, a Linux Foundation report found 46% of surveyed users run OpenSearch as a managed service, indicating significant cloud uptake. High demand (87%) for better interoperability also suggests users see it as part of a broader stack.

The long shadow of Elasticsearch

Despite progress, OpenSearch faces challenges, primarily the constant comparison with Elasticsearch. For example, Elastic often claims performance advantages (40% to 140% faster). However, a March 2025 Trail of Bits benchmark comparing OpenSearch 2.17.1 and Elasticsearch 8.15.4 found OpenSearch faster overall on the “Big 5” workload and moderately faster in Vectorsearch (default engines), though results varied. Benchmarks are notoriously unreliable gauges; your mileage may vary.

Nor can OpenSearch still claim to be the open source alternative to Elasticsearch. In late 2024, Elastic added an AGPLv3 license option alongside SSPL and ELv2. Skeptics viewed this return to open source as a cynical response to OpenSearch’s momentum, but Shay Banon, Elastic’s cofounder, told me in our conversations that the company had always wanted to return to an OSI-approved license: “I personally always wanted to get back to open source, even when we changed the license. We were hoping AWS would fork and would let us move back when enough time has passed.” Whatever the motivation, Elasticsearch is now just as open source as OpenSearch.

That comparison no longer really matters. OpenSearch has proven it’s more than AWS’ knee-jerk reaction to supply chain risk. OpenSearch is building its own identity, focused on next-gen workloads. Still, OpenSearch’s big challenge remains converting its open governance and permissive licensing into an ecosystem that builds search superior to Elasticsearch or other competitors. There’s a long way to go, but its progress in the past few years, and particularly in 2024, suggests OpenSearch is here to stay, and to win.

Baidu hits the turbo button to get back into AI race 26 Apr 2025, 1:28 am

An industry analyst Friday offered a lukewarm response to a series of announcements from Chinese tech giant Baidu around upgrades to its multimodal foundation model, ERNIE 4.5, and reasoning model, ERNIE X1, first released last month.

During his keynote at the firm’s annual developer conference in Wuhan, China, CEO Robin Li launched ERNIE 4.5 Turbo and ERNIE X1 Turbo, which, according to a release, feature “enhanced multimodal capabilities, strong reasoning, low costs and are available for users to access on Ernie Bot now free of charge.”

Li said, “the releases aim to empower developers to build the best applications, without having to worry about model capability, costs, or development tools. Without practical applications, neither advanced chips nor sophisticated models hold value.”

At the launch of the new models’ predecessors last month, Baidu said in a release that the introduction of the two offerings “pushes the boundaries of multimodal and reasoning models,” adding that ERNIE X1 “delivers performance on par with DeepSeek R1 at only half the price.”

The firm said it plans to integrate both new models into its product ecosystem, and that the integration will include Baidu Search, China’s largest search engine, as well as other offerings.

According to a Reuters report, during his keynote Li also announced that Baidu had “successfully illuminated a cluster comprising [of] 30,000 of its self-developed, third generation P800 chips, which can support the training of DeepSeek-like models.”

Analysts unimpressed

Paul Smith-Goodson, vice president and principal analyst for quantum computing, AI and robotics at Moor Insights & Strategy, was unimpressed.

“[Baidu’s] announcement that the P800 Kunlun chip clusters were ‘illuminated’ only means they were turned on in preparation for training models with hundreds of billions of parameters,” he said. “While that is a technical advancement for China, it is the norm for companies such as OpenAI, Google, IBM, Anthropic, Microsoft, and Meta to train their models with hundreds of billions of parameters.”

Also, said Smith-Goodson, “Baidu’s statement that it used 30,000 Kunlun chips is nothing exceptional when compared to the number of GPUs the US uses to train large models. Kunlun chips are also inferior to US GPUs. In the next-gen AI we will be using something on the order of 100,000 GPUs. Because there is a lack of benchmarks, I have to be skeptical about the performance of this model compared to global leaders.”

Smith-Goodson pointed out, “it boils down to a race between China and the US to build the first Artificial General Intelligence (AGI) model. The US still holds a lead, but China is pressing hard to catch up.”

Thomas Randall, director of AI market research at Info-Tech Research Group, was also lukewarm about the announcements. Still, he pointed out, “Baidu remains an important part of China’s competitive AI sector, which includes companies like Alibaba, Tencent, and Huawei.”

Baidu’s ERNIE models, he said, “are one of the few domestically developed LLM series that compete with OpenAI/GPT-level models. The Kunlun chips and new cluster announcement reinforce that Baidu isn’t just doing models. Baidu has become a broad provider for hardware and applications, too.”

Strategically relevant but commercially limited

However, Randall said, Baidu “remains under immense pressure from emerging startups like DeepSeek, Moonshot AI, and the cloud giants like Alibaba. While still a heavyweight, Baidu is not unchallenged in China.”

He added that, across western countries, Baidu remains largely irrelevant because of the lack of trust in geopolitics, and the decoupling of the US and Chinese tech ecosystems. “[This] makes Western expansion near impossible. Moreover, in global AI model benchmarks, Baidu is mostly a secondary mention against the likes of OpenAI, Anthropic, Google, and Mistral.”

But overall, said Randall, “Baidu remains strategically relevant globally, but commercially limited across the West. The key takeaway for western AI companies is that innovation is not US-centric, but that only assists in pushing the AI race forward.”

Thesys introduces generative UI API for building AI apps 26 Apr 2025, 12:21 am

AI software builder Thesys has introduced C1 by Thesys, which the company describes as a generative UI API that uses large language models (LLMs) to generate user interfaces on the fly.

Unveiled April 18, C1 by Thesys is available for general use. C1 lets developers turn LLM outputs into dynamic, intelligent interfaces in real time, Thesys said. In elaborating on the C1 technology, Thesys said enterprises now are racing to adopt AI, but building the front end for AI agents has remained a major hurdle. Teams spend months and significant resources designing and coding interfaces, only to deliver static, inconsistent, and often-disengaging user experiences, the company said.

Generative UI enables LLMs to generate interactive interfaces in real time, Thesys said. Generative UI interprets natural language prompts, generates contextually relevant UI components, and adapts dynamically based on user interaction or state changes. C1 can generate UI for any use case and any data, enables UI generations to be guided via system prompts, supports integration with external tools via function calling, and supports a wide variety of UI components via its Crayon React framework, according to Thesys.

More than 300 teams already are using Thesys tools to design and deploy adaptive AI interfaces, according to the company.


Cloud native explained: How to build scalable, resilient applications 25 Apr 2025, 3:48 pm

What is cloud native? Cloud native defined

The term “cloud-native computing” encompasses the modern approach to building and running software applications that exploit the flexibility, scalability, and resilience of cloud computing. The phrase is a catch-all that encompasses not just the specific architecture choices and environments used to build applications for the public cloud, but also the software engineering techniques and philosophies used by cloud developers.

The Cloud Native Computing Foundation (CNCF) is an open source organization that hosts many important cloud-related projects and helps set the tone for the world of cloud development. The CNCF offers its own definition of cloud native:

Cloud native practices empower organizations to develop, build, and deploy workloads in computing environments (public, private, hybrid cloud) to meet their organizational needs at scale in a programmatic and repeatable manner. It is characterized by loosely coupled systems that interoperate in a manner that is secure, resilient, manageable, sustainable, and observable.

Cloud native technologies and architectures typically consist of some combination of containers, service meshes, multi-tenancy, microservices, immutable infrastructure, serverless, and declarative APIs — this list is not exhaustive.

This definition is a good start, but as cloud infrastructure becomes ubiquitous, the cloud native world is beginning to spread beyond the core of this definition. We’ll explore that evolution as well, and look into the near future of cloud-native computing.

Cloud native architectural principles

Let’s start by exploring the pillars of cloud-native architecture. Many of these technologies and techniques were considered innovative and even revolutionary when they hit the market over the past few decades, but now have become widely accepted across the software development landscape.

Microservices. One of the huge cultural shifts that made cloud-native computing possible was the move from huge, monolithic applications to microservices: small, loosely coupled, and independently deployable components that work together to form a cloud-native application. These microservices can be scaled across cloud environments, though (as we’ll see in a moment) this makes systems more complex.

Containers and orchestration. In cloud-native architectures, individual microservices are executed inside containers: lightweight, portable virtual execution environments that can run on a variety of servers and cloud platforms. Containers insulate developers from having to worry about the underlying machines on which their code will execute. That is, all they have to do is write to the container environment.

Getting the containers to run properly and communicate with one another is where the complexity of cloud native computing starts to emerge. Initially, containers were created and managed by relatively simple platforms, the most common of which was Docker. But as cloud-native applications got more complex, container orchestration platforms that augmented Docker’s functionality emerged, such as Kubernetes, which allows you to deploy and manage multi-container applications at scale. Kubernetes is critical to cloud native computing as we know it — it’s worth noting that the CNCF was set up as a spinoff of the Linux Foundation on the same day that Kubernetes 1.0 was announced — and adhering to Kubernetes best practices is an important key to cloud native success.

Open standards and APIs. The fact that containers and cloud platforms are largely defined by open standards and open source technologies is the secret sauce that makes all this modularity and orchestration possible, and standardized and documented APIs offer the means of communication between distributed components of a larger application. In theory, anyway, this standardization means that every component should be able to communicate with other components of an application without knowing about their inner workings, or about the inner workings of the various platform layers on which everything operates.

DevOps, agile methodologies, and infrastructure as code. Because cloud-native applications exist as a series of small, discrete units of functionality, cloud-native teams can build and update them using agile philosophies like DevOps, which promotes rapid, iterative CI/CD development. This enables teams to deliver business value more quickly and more reliably.

The virtualized nature of cloud environments also makes them great candidates for infrastructure as code (IaC), a practice in which teams use tools like Terraform, Pulumi, and AWS CloudFormation to manage infrastructure declaratively and version those declarations just like application code. IaC boosts automation, repeatability, and resilience across environments, all big advantages in the cloud world. IaC also goes hand-in-hand with the concept of immutable infrastructure: the idea that, once deployed, infrastructure-level entities like virtual machines, containers, or network appliances don’t change, which makes them easier to manage and secure. IaC stores declarative configuration code in version control, which creates an audit log of any changes.
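The declarative idea behind IaC can be sketched in a few lines of Python. This is a minimal illustration, not how Terraform, Pulumi, or CloudFormation work internally: the `plan` function, the resource names, and the dictionary layout are all hypothetical. The sketch compares a desired state (what would live in version control) with the current state and lists the steps needed to reconcile them, replacing changed resources rather than modifying them in place, in the spirit of immutable infrastructure.

```python
from typing import Dict, List, Tuple

# A resource is described by a plain dictionary of settings (hypothetical layout).
Resource = Dict[str, object]


def plan(desired: Dict[str, Resource], current: Dict[str, Resource]) -> List[Tuple[str, str]]:
    """Return the (action, name) steps that move the current state to the desired state."""
    steps: List[Tuple[str, str]] = []
    for name in sorted(desired):
        if name not in current:
            steps.append(("create", name))
        elif desired[name] != current[name]:
            # Immutable infrastructure: replace the resource instead of editing it.
            steps.append(("replace", name))
    for name in sorted(current):
        if name not in desired:
            steps.append(("delete", name))
    return steps


# Desired state: the declaration a team would keep in version control.
desired_state = {
    "web": {"image": "web:2.0", "replicas": 3},
    "db": {"image": "postgres:16", "replicas": 1},
}
# Current state: what is actually deployed right now.
current_state = {
    "web": {"image": "web:1.9", "replicas": 3},
    "cache": {"image": "redis:7", "replicas": 1},
}

for action, name in plan(desired_state, current_state):
    print(action, name)
```

Because the plan is computed from two declarations rather than from a sequence of manual commands, running it again once the states match produces no steps, which is what makes the approach repeatable.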

Chart listing five things to love and five things to fear when considering cloud native. There’s a lot to love about cloud-native architectures, but there are also several things to be wary of when considering them.

IDG

How the cloud-native stack is expanding

As cloud-native development becomes the norm, the cloud-native ecosystem is expanding; the CNCF maintains a graphical representation of what it calls the cloud native landscape that hammers home the expansive and bewildering variety of products, services, and open source projects that contribute to (and seek to profit from) cloud-native computing. And there are a number of areas where new and developing tools are complicating the picture sketched out by the pillars we discussed above.

An expanding Kubernetes ecosystem. Kubernetes is complex, and teams now rely on an entire ecosystem of projects to get the most out of it: Helm for packaging, ArgoCD for GitOps-style deployments, and Kustomize for configuration management. And just as Kubernetes augmented Docker for enterprise-scale deployments, Kubernetes itself has been augmented and expanded by service mesh offerings like Istio and Linkerd, which offer fine-grained traffic control and improved security.

Observability needs. The complex and distributed world of cloud-native computing requires in-depth observability to ensure that developers and admins have a handle on what’s happening with their applications. Cloud-native observability uses distributed tracing and aggregated logs to provide deep insight into performance and reliability. Tools like Prometheus, Grafana, Jaeger, and OpenTelemetry support comprehensive, real-time observability across the stack.

Serverless computing. Serverless computing, particularly in its function-as-a-service guise, offers to strip needed compute resources down to their bare minimum, with functions running on service provider clouds using exactly as much as they need and no more. Because these services can be exposed as endpoints via APIs, they are increasingly integrated into distributed applications, operating side-by-side with functionality provided by containerized microservices. Watch out, though: the big FaaS providers (Amazon, Microsoft, and Google) would love to lock you in to their ecosystems.

FinOps. Cloud computing was initially billed as a way to cut costs (no need to pay for an in-house data center that you barely use), but in practice it replaces capex with opex, and sometimes you can run up truly shocking cloud service bills if you aren’t careful. Serverless computing is one way to cut down on those costs, but financial operations, or FinOps, is a more systematic discipline that aims to align engineering, finance, and product to optimize cloud spending. FinOps best practices make use of those observability tools to determine which departments and applications are eating up resources.
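As a rough illustration of the attribution step FinOps relies on, the sketch below totals spending per team from a list of billing records. The record layout, the `team` tag, the `cost_usd` field, and the sample figures are all hypothetical; real cloud billing exports differ by provider.

```python
from collections import defaultdict
from typing import Dict, List


def cost_by_team(records: List[dict]) -> Dict[str, float]:
    """Sum cost per team tag, highest spender first; untagged spend is grouped together."""
    totals: Dict[str, float] = defaultdict(float)
    for record in records:
        team = record.get("team", "untagged")
        totals[team] += record["cost_usd"]
    # Sort by total cost so the biggest consumers appear first.
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


# Hypothetical billing records for illustration only.
billing_records = [
    {"service": "compute", "team": "checkout", "cost_usd": 120.0},
    {"service": "storage", "team": "search", "cost_usd": 80.0},
    {"service": "functions", "team": "checkout", "cost_usd": 40.0},
    {"service": "network", "cost_usd": 15.0},  # no team tag
]

for team, total in cost_by_team(billing_records).items():
    print(team, total)
```

Surfacing the untagged bucket explicitly is a deliberate choice here: spend that nobody owns is usually the first thing such a report needs to expose.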

Advantages and challenges for cloud-native development

Cloud native has become so ubiquitous that its advantages are almost taken for granted at this point, but it’s worth reflecting on the beneficial shift the cloud native paradigm represents. Huge, monolithic codebases that saw updates rolled out once every couple of years have been replaced by microservice-based applications that can be improved continuously. Cloud-based deployments, when managed correctly, make better use of compute resources and allow companies to offer their products as SaaS or PaaS services.

But cloud-native deployments come with a number of challenges, too:

Complexity and operational overhead: You’ll have noticed by now that many of the cloud-native tools we’ve discussed, like service meshes and observability tools, are needed to deal with the complexity of cloud-native applications and environments. Individual microservices are deceptively simple, but coordinating them all in a distributed environment is a big lift.
Security: More services executing on more machines, communicating by open APIs, all adds up to a bigger attack surface for hackers. Containers and APIs each have their own special security needs, and a policy engine can be an important tool for imposing a security baseline on a sprawling cloud-native app. DevSecOps, which adds security to DevOps, has become an important cloud-native development practice to try to close these gaps.
Vendor lock-in: This may come as a surprise, since cloud-native is based on open standards and open source. But there are differences in how the big cloud and serverless providers work, and once you’ve written code with one provider in mind, it can be hard to migrate elsewhere.
A persistent skills gap: Cloud-native computing and development may have years under its belt at this point, but the number of developers who are truly skilled in this arena is a smaller portion of the workforce than you’d think. Companies face difficult choices in bridging this skills gap, whether that’s bidding up salaries, working to upskill current workers, or allowing remote work so they can cast a wide net.

Cloud native in the real world

Cloud native computing is often associated with giants like Netflix, Spotify, Uber, and Airbnb, where many of its technologies were pioneered in the early 2010s. But the CNCF’s Case Studies page provides an in-depth look at how cloud native technologies are helping companies. Examples include the following:

A UK-based payment technology company that can switch between data centers and clouds with zero downtime
A software company whose product collects and analyzes data from IoT devices — and can scale up as the number of gadgets grows
A Czech web service company that managed to improve performance while reducing costs by migrating to the cloud

Cloud-native infrastructure’s capability to quickly scale up to large workloads also makes it an attractive platform for developing AI/ML applications: another one of those CNCF case studies looks at how IBM uses Kubernetes to train its Watsonx assistant. The big three providers are putting a lot of effort into pitching their platforms as the place for you to develop your own generative AI tools, with offerings like Azure AI Foundry, Google Firebase Studio, and Amazon Bedrock. It seems clear that cloud native technology is ready for what comes next.


Docker’s new MCP Catalog, Toolkit to solve major developer challenges, experts say 25 Apr 2025, 12:24 pm

Docker, provider of containers for application development, is planning to add a new Model Context Protocol (MCP) Catalog and a Toolkit that experts say could solve major challenges faced by developers when building out agentic applications.

Anthropic’s MCP, which was released in November last year, is an open protocol that allows AI agents inside applications to access external tools and data to complete a user request using a client-server mechanism, where the client is the AI agent and the server provides tools and data.
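The client-server split described above can be illustrated with a toy, in-process sketch. It is not the MCP wire format or any official SDK: the class names `ToolServer` and `AgentClient`, the `list_tools` and `call_tool` message names, and the `add` tool are all hypothetical. The sketch only shows the general shape of the exchange, in which the agent discovers what a server offers and then asks it to run a tool.

```python
import json
from typing import Any, Callable, Dict, List


class ToolServer:
    """Hypothetical server that exposes named tools to an agent."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., Any]] = {}

    def register(self, name: str, func: Callable[..., Any]) -> None:
        self._tools[name] = func

    def handle(self, raw_request: str) -> str:
        """Handle one JSON-encoded request and return a JSON-encoded reply."""
        request = json.loads(raw_request)
        method = request["method"]
        if method == "list_tools":
            return json.dumps({"result": sorted(self._tools)})
        if method == "call_tool":
            tool = self._tools.get(request["name"])
            if tool is None:
                return json.dumps({"error": "unknown tool: " + request["name"]})
            return json.dumps({"result": tool(**request.get("arguments", {}))})
        return json.dumps({"error": "unknown method: " + method})


class AgentClient:
    """Hypothetical agent-side client that talks to a ToolServer."""

    def __init__(self, server: ToolServer) -> None:
        self._server = server

    def _request(self, payload: dict) -> dict:
        # Messages are serialized to JSON to mimic crossing a process boundary.
        return json.loads(self._server.handle(json.dumps(payload)))

    def list_tools(self) -> List[str]:
        return self._request({"method": "list_tools"})["result"]

    def call(self, name: str, **arguments: Any) -> Any:
        reply = self._request({"method": "call_tool", "name": name, "arguments": arguments})
        if "error" in reply:
            raise RuntimeError(reply["error"])
        return reply["result"]


server = ToolServer()
server.register("add", lambda a, b: a + b)  # illustrative tool only

client = AgentClient(server)
print(client.list_tools())
print(client.call("add", a=2, b=3))
```

Keeping the tools behind a server boundary, rather than importing them into the agent, is what allows the server side to be isolated, for example inside a container.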

Agentic applications, which can perform tasks without manual intervention, have caught the fancy of enterprises as they allow them to do more with constrained resources.

But without MCP, developers would face a major challenge: they would be unable to connect disparate data sources and tools with large language models (LLMs), without which agents cannot perform tasks on their own.

Docker, which is where at least 20 million developers build their applications, is adding Catalog and Toolkit as it says that despite MCP’s popularity, its experience is “not production-ready — yet.”

“Discovery (of tools) is fragmented, trust is manual, and core capabilities like security and authentication are still patched together with workarounds,” Docker executives wrote in a blog post.

Paul Chada, co-founder of DoozerAI, an agentic digital worker platform, said that presently, MCP servers are messy client-side installs and not true enterprise-grade solutions, meaning they run directly on users’ PCs and potentially expose credentials.

Catalog and Toolkit are expected to solve challenges related to tool discovery, credential management, and security, with the Catalog serving as the home for discovering MCP tools and Toolkit simplifying the process of running and managing MCP servers securely, the executives explained.

MCP Catalog to aid tools discovery

The Catalog, according to Docker, is essentially a marketplace where authors or builders of these tools can publish them for developers to discover.

To get the marketplace running, Docker has partnered with several companies, including Stripe, Elastic, Heroku, Pulumi, Grafana Labs, Kong Inc., Neo4j, New Relic, and Continue.dev, the executives said. At launch, the marketplace would contain over 100 verified tools.

Chada sees MCP Catalog serving as an accredited, secure hub and marketplace for MCP servers, which can then be subsequently deployed into Docker containers using the new MCP Toolkit.

“The Catalog will help developers find trusted or verified tools, which reduces the risk of security breaches,” Chada explained.

In the same vein, Moor Insights and Strategy’s principal analyst Jason Andersen said that Docker is probably the first to try and build a centralized place to discover tools related to MCP, which is non-existent presently as the protocol is very new.

MCP Toolkit for simple management of MCP servers

Docker’s MCP Toolkit, according to Chada, solves key developer challenges, such as environment conflicts, security vulnerabilities from host access, complex setup requirements, and cross-platform inconsistencies, by offering several features.

“To bypass developer challenges, the Toolkit offers a one-click deployment from Docker Desktop, built-in credential management, containerized isolation, a Gateway Server, and a dedicated command line interface (CLI),” Chada said.

“By containerizing MCP servers, Docker creates a standardized, secure environment where developers can focus on building AI applications rather than wrestling with configuration and security issues,” Chada added.

Both Catalog and Toolkit are expected to be made available in May, the company said. However, it is yet to finalize the pricing of both offerings.

Both Chada and Andersen believe that Docker’s rivals, such as Kubernetes, are also expected to add similar capabilities soon. Andersen further believes that most cloud service providers will also start offering something similar to the Catalog and Toolkit as MCP’s popularity grows.


Python and WebAssembly? Here’s how to make it work 25 Apr 2025, 11:00 am

Execution falls short

Agentic AI remains more conceptual than practical. For all its potential, the technology has failed to demonstrate widespread adoption or scalability in enterprise contexts. We hear a lot about self-directed systems transforming industries, but evidence of meaningful deployment is painfully scarce.

Deloitte’s recent AI survey found that only 4% of enterprises pursuing AI are actively piloting or implementing agentic AI systems. The vast majority remain trapped in cautious experimentation. This gap isn’t surprising given the challenges involved. Agentic AI requires advanced reasoning, contextual understanding, and the ability to learn and adapt autonomously in complex, unstructured environments. This level of sophistication is still aspirational for most organizations.

Furthermore, infrastructure and cost hurdles are daunting. A recent Gartner report revealed that rolling out agentic AI projects often costs two to five times more than traditional machine learning initiatives. These systems demand extensive training data, advanced processing power, and robust integration with existing workflows—investments not all enterprises are prepared to make.

Where the disconnect lies

Agentic AI adoption often stumbles for two key reasons: technological immaturity and overblown expectations. The technology promises autonomous decision-making, but it struggles to handle edge cases, unpredictable variables, and the nuances of human decision-making contexts in practical scenarios. I’ve seen this firsthand.

Consider self-driving vehicles, touted for years as a flagship example of agentic AI. Although companies like Tesla and Waymo have made progress, full autonomy remains a distant goal fraught with technical setbacks. Enterprises pursuing agentic AI quickly encounter similar pitfalls where the systems falter in dynamic, real-world scenarios that require judgment and adaptability.

These examples highlight the widening gap between marketing rhetoric and implementation capabilities. The hype promises revolutionary change, yet real progress is slow and incremental.

Reassess your approach

Hype-driven initiatives rarely end well. Enterprises that invest in agentic AI without a clear road map for value creation risk wasting time, money, and resources. Instead of chasing the flashiest new technology, organizations should concentrate on their specific needs and measurable outcomes. Large-scale agentic AI solutions may not provide the answer. Many organizations could achieve a better ROI by implementing simpler AI tools, such as recommendation systems or predictive analytics that integrate seamlessly into existing workflows.

The path to meaningful AI adoption starts with clarity. Before scaling, enterprises should prioritize pilot programs and test agentic AI in controlled environments. These tests should be accompanied by key performance indicators that track measurable performance, such as cost savings and improvements in process efficiency.

Additionally, infrastructure readiness is crucial. Agentic AI typically requires robust data sets, seamless integration, and a commitment to addressing ethical concerns such as bias and accountability. Without these elements, projects are likely to fail.

Enterprises also need to hold vendors accountable. Too much of today’s agentic AI marketing lacks transparency and makes bold claims without providing adequate proof points or benchmarks. Ask questions. Get objective answers. Businesses must demand deeper insights into scalability, deployment timelines, and technical limitations to make informed decisions.

Managing hype versus value

Agentic AI has undeniable potential, but its current state is overhyped and underdelivered. Enterprises rushing to adopt these technologies risk falling into expensive traps, seduced by promises of autonomy without understanding the underlying complexities.

Organizations can avoid the pitfalls of hype-driven adoption by focusing on immediate business needs, prioritizing incremental AI solutions, and demanding transparency from vendors. This should not be a race to be the first to adopt agentic AI—it should be about adopting it in the smartest ways possible. The best path forward for the vast majority of enterprises is to wait for the technology to mature while pursuing today’s more pragmatic AI initiatives.

Ultimately, AI success in enterprises isn’t about chasing the headlines; it’s about creating real, measurable value. By staying grounded in practical realities, businesses will position themselves for sustainable growth today and in the future when agentic AI finally fulfills its potential.


Databricks to infuse $250M to double its R&D staff in India this year 24 Apr 2025, 12:07 pm

Databricks is planning to double its research and development (R&D) staff in India by the end of this year in an effort to accelerate the development of new capabilities and large language models (LLMs).

“This year, we plan to hire an additional 100-plus R&D engineers to strengthen our capabilities,” Vinod Marur, senior vice president of Engineering at Databricks, said during a media briefing.

The data lakehouse platform provider opened its R&D hub in Bengaluru in 2023, and since then, it has hired around 100 engineers to support the division.

The India R&D team, according to Marur, contributes to the development of LLMs, new capabilities such as the Lakeflow Connect, and aids in infrastructure management for enterprise customers.

Adding more engineers in India is driven by the shortage of AI skillsets globally and the availability of technology talent at scale in the country.

Other R&D centres of the company are located in Amsterdam, Belgrade, Berlin, San Francisco, Mountain View and Seattle.

The R&D team expansion is part of the company’s $250 million planned investment in India for the next three years, which Ed Lenta, SVP for Databricks Asia Pacific & Japan, attributed to its growing customer base, demand for data modernization, and proliferation of generative AI in the country.

To that effect, Lenta said that a majority of the $250 million investment will go into hiring new staffers, including R&D and additional go-to-market and customer success professionals.

“We want to give our Indian customers an amazing experience, and so we’re going to be investing significantly in the go-to-market organization to serve Indian startups and other enterprise customers,” Lenta said during the briefing.

In 2023, Databricks said that it would increase its 250-strong workforce by 50%, taking the staff count to 375. It has already crossed that mark and employs about 500 people in India. Lenta said that Databricks plans to increase its staff count to 750 by the end of the year.

Its customers in India include HDFC Bank, Myntra, Zepto, Dream 11, Air India, Parle, and TVS.

As part of the $250 million investment, the company also plans to scale up its learning and enablement programme via the new Data + AI academy.

The academy aims to train 500,000 partners and customers in India over the next three years on data modernization and generative AI, Lenta said.


Micro front ends on the Microsoft web platform 24 Apr 2025, 11:00 am

Breaking up monolithic code into clusters of services and microservices has been a good thing. It has allowed us to refactor our code and find the bottlenecks and blockers. In some cases, that’s led us back to new, streamlined monoliths, in other cases to building distributed cloud-native platforms.

But there’s one area that’s generally resisted that change: our application front ends, especially those built using web technologies. We’ve gone from forms and tables to JavaScript-powered single-page applications. But despite those changes, the web is still stubbornly monolithic, slowing down development and making it hard to quickly build necessary changes.

Introducing micro front ends

So, why not bring the microservices model to the front end? A web page is simply a container for various page elements that can be treated as components. We can break up a page into sets of functional components that can be managed by different development teams, using themes to customize them. This lets us build common reusable elements, which can be supported outside of web content and reused across different pages, linking to APIs as necessary.

This idea isn’t particularly new; it was first discussed in 2016 in ThoughtWorks’ Technology Radar report, where it was given the name “micro frontends.” As the model of web development has transitioned from dynamic single-page applications to more CDN-friendly static websites, the idea has become more popular.

One other benefit of the micro front ends approach is that it allows you to develop expertise in specific domains rather than in specific applications. For example, building your search functions in a single component means a team can be given specific targets for your various front ends and tune and customize its components. The same would be true for shopping carts, navigation, or any other services you need.

It’s even possible to break things down further. You can have a team that builds and maintains your grid controls or delivers the elements needed to navigate and display catalog contents. Using a common library of design and page elements simplifies delivering and managing front ends, so you can pick and choose components as needed.

Micro front ends and Microsoft

A lot of work has gone into the philosophy and techniques necessary to deliver micro front ends and how you can build architectures that construct an application out of components, using familiar tools. Microsoft has been using the micro front ends approach internally in Azure, but it has also been investing in ways to use it in familiar development tools and frameworks, such as ASP.NET Core and Blazor.

Unfortunately, the current approach does make things a little disjointed; there’s no one way to deliver micro front ends with Azure, so you’ll have to build the necessary architecture yourself. Key technologies to explore include publishing static content to Azure Static Web Apps, as well as the View Components and WebAssembly tools in ASP.NET Core.

Microsoft’s web application frameworks are particularly well suited to micro front ends, as they build on the component model used in XAML desktop applications and the lessons learned in delivering cloud-native applications at scale in Azure. Other tools, such as Azure Static Web Apps, make it possible to build front ends that are suitable for CDNs.

Using View Components to host micro front ends

One approach that can help you transition to micro front ends is ASP.NET Core’s View Components. All you need to do is pass data to a View Component; there’s no model binding. View Components offer all the features necessary to encapsulate functionality in a single block of code, including their own business logic. They work asynchronously and take parameters that are passed through a host page’s logic. In practice, you treat the host as a controller and a view component as a view.

Outside of ASP.NET Core, Microsoft has developed its own React library to deliver micro front ends. Written in TypeScript and available from GitHub, the project provides tools to build both host applications and micro front end components. Data is shared between host and components as context. There’s not much documentation, but you can use the sample code to see how you can implement the library in your applications.

One key point about micro front ends is that they are technology agnostic. The DOM and the web components specification (specifically Custom Elements) are the glue that holds components together. There’s no need to standardize on one web development framework. Each independent element can be built using its development team’s choice of tools.

From HTML and JavaScript to WebAssembly

You can also progressively migrate functionality from HTML and JavaScript to WebAssembly. Blazor offers a framework to host, link, and manage your WebAssembly components using tools like lazy loading to load them as needed, rather than all at once when a page is opened. ASP.NET Core’s route model can manage asynchronous navigation events, helping link UI components through a host that runs inside your web application.

Components can then be delivered as WebAssembly. They are treated as individual containers, and you can use your choice of development languages. With WinUI 3 controls available for WebAssembly, you can quickly build rich user experiences that run in most browsers.

There are many ways to host and deploy micro front ends in Azure. One interesting possibility is to use the lazy loading techniques in many JavaScript frameworks alongside the Azure CDN platform. You’re not limited to the web. Micro front end components can be delivered in React Native or ReactJS, allowing you to separate controls from a common core that can be delivered to specific platforms. Microsoft uses this approach with its Family Safety app, written in Kotlin for Android and Swift for iOS, calling React Native. The same components can be reused in Windows or on the web as ReactJS.

Here Microsoft has developed what it calls a back end for front-end service, which provides a consistent API for components and delivers common visual elements. A layer like this lets you have one common authentication environment managed by the back end for the front end. At the same time, you can use this layer to deliver application-specific business logic allowing micro front end components to be reusable across different applications.

Using Azure Front Door

Azure’s Content Delivery Network allows you to cache static content outside of the core Azure network, using a set of local and metro-level cache servers to deliver content as needed. You’re probably familiar with using Azure CDN and infinite scrolling page designs to avoid loading all of your content at once. The same techniques (and calls) can be used to download micro front end components.

Now part of Azure Front Door, Azure’s CDN tool is the gateway to Azure Static Web Apps sites, ensuring that all your micro front end components are cached and that the service will manage site content for you. It’s a good idea to keep an eye on the time-to-live settings of your cache content. If this is set too high and you push an update to a key component, users will not see it before the cache expires unless you purge the cache. By making a purge part of every deployment, you can ensure that users will get the new code as soon as it’s published.
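The interaction between time-to-live settings and purging can be shown with a small simulation. This is a sketch of the general caching behavior only, not Azure Front Door’s API: the `EdgeCache` class, the `/cart.js` path, and the version strings are hypothetical, and time is passed in explicitly so the example is deterministic.

```python
from typing import Dict, Tuple


class EdgeCache:
    """Hypothetical edge cache that keeps content until its time-to-live expires."""

    def __init__(self, ttl_seconds: float) -> None:
        self.ttl_seconds = ttl_seconds
        # Maps a path to (cached content, time it was fetched).
        self._entries: Dict[str, Tuple[str, float]] = {}

    def get(self, path: str, origin: Dict[str, str], now: float) -> str:
        entry = self._entries.get(path)
        if entry is not None and now - entry[1] < self.ttl_seconds:
            return entry[0]  # still fresh: serve the cached copy
        content = origin[path]  # missing or expired: fetch from the origin
        self._entries[path] = (content, now)
        return content

    def purge(self) -> None:
        """Drop every cached entry, as a deployment step might do."""
        self._entries.clear()


origin = {"/cart.js": "v1"}
cache = EdgeCache(ttl_seconds=3600)

first = cache.get("/cart.js", origin, now=0)    # fetched from origin and cached
origin["/cart.js"] = "v2"                       # a new component version is deployed
stale = cache.get("/cart.js", origin, now=60)   # TTL not expired: old copy is served
cache.purge()                                   # purge as part of the deployment
fresh = cache.get("/cart.js", origin, now=61)   # refetched: new copy is served

print(first, stale, fresh)
```

With a one-hour TTL and no purge, users would keep receiving the old component until the entry expired, which is the situation the purge step avoids.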

Because micro front ends are technology agnostic and built on web standards, you’ve got a wide choice of frameworks to use with Azure Static Web Apps, from Angular and React to Next.js and Blazor. This fits with the philosophy that micro front end dev teams have the freedom to build the components they’re responsible for in their choice of tools, with everything coming together in the host template.

Once you’ve started building a library of common micro front end components, you can treat them as an inner source project, where internal teams focus on the needs of the organization rather than a single project. That may mean not meeting your one specific need or having features you won’t use, but at the same time, you should gain quicker bug fixes, improved performance, and a common developer experience with the rest of your organization, making it easier for developers to shift from project to project as needed.


MarkItDown: Microsoft’s open-source tool for Markdown conversion 24 Apr 2025, 11:00 am

The rapid evolution of generative AI has created a pressing need for tools that can efficiently prepare diverse data sources for large language models (LLMs). Transforming information that is encoded in various file formats into a structure that LLMs can readily understand is a significant hurdle. Addressing this, Microsoft has open-sourced MarkItDown, a powerful utility designed to convert file content into Markdown.

MarkItDown is an open-source Python utility that simplifies converting diverse file formats into Markdown. With its robust capabilities, MarkItDown addresses challenges in document processing and plays a pivotal role in workflows involving LLMs.

Project overview – MarkItDown

MarkItDown is available both as a Python library and a command-line tool. Released only months ago, it has quickly garnered attention within the developer community, amassing significant interest on GitHub (currently ~50k stars). Its primary goal is to act as a universal translator, converting PDFs, text files, office documents, and even rich media into clean Markdown text. Unlike some converters that focus solely on text extraction, MarkItDown prioritizes preserving essential document structures like headings, lists, tables, and links, making the output highly suitable for text analysis pipelines and LLM ingestion.

By leveraging advanced technologies such as optical character recognition (OCR) and speech recognition, MarkItDown extracts content from images and audio files. This functionality makes it a versatile tool for tasks like indexing, text analysis, and preparing data sets for LLM training.

What problem does MarkItDown solve?

Data scientists face challenges when working with content across multiple files in diverse formats when implementing retrieval-augmented generation (RAG) solutions. Extracting useful information from documents like PDFs, images, or spreadsheets can be time-consuming and error-prone. Traditional tools frequently lack the flexibility to handle diverse formats or maintain document structure during conversion.

Key pain points include:

Difficulty in extracting content from non-standard file types.
Loss of formatting when converting structured documents like tables or lists.
Limited support for multi-modal data such as images and audio.

MarkItDown overcomes these obstacles by offering a unified solution for converting files into Markdown while preserving essential document structure and metadata.

A closer look at MarkItDown

MarkItDown has a modular and extensible architecture. At its core is the DocumentConverter class, which defines a generic convert() method. Specialized converters inherit from this base class to handle specific file types dynamically.

For example, converting a Microsoft Excel file to Markdown involves a few lines of code.


from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("example.xlsx")
print(result.text_content)

For image descriptions, you need to configure an LLM client. The example below uses GPT-4o to help MarkItDown convert an image into Markdown.

from openai import OpenAI
from markitdown import MarkItDown

client = OpenAI(api_key="your-api-key")
md = MarkItDown(llm_client=client, llm_model="gpt-4o")
result = md.convert("example_image.jpg")
print(result.text_content)

Some of the key features of MarkItDown include:

Multi-modal capabilities: Processes images (using an integrated LLM like GPT-4o for description generation) and audio files (using speech recognition libraries for transcription).
Multi-format support: Converts Office files, HTML, JSON, XML, images (with OCR), audio (with transcription), and more.
Structure preservation: Maintains document structure (headings, lists, tables) during conversion.
LLM integration: Enhances image processing by generating descriptions using LLMs such as GPT-4o.
In-memory processing: Eliminates the need for temporary files during conversion.
Plug-in architecture: Allows developers to add custom converters for new formats easily.
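The modular design described above, a base class with a generic convert() method plus specialized converters that can be added for new formats, can be pictured with a small sketch. This is not MarkItDown’s actual code or plug-in interface: BaseConverter, PlainTextConverter, CsvConverter, and ConverterRegistry are hypothetical names, and the sketch assumes file content is already available as text.

```python
from abc import ABC, abstractmethod
import csv
import io


class BaseConverter(ABC):
    """Hypothetical base class exposing one generic convert() entry point."""

    extensions = ()

    @abstractmethod
    def convert(self, text):
        """Return Markdown for the given file content."""


class PlainTextConverter(BaseConverter):
    extensions = (".txt",)

    def convert(self, text):
        return text.strip() + "\n"


class CsvConverter(BaseConverter):
    extensions = (".csv",)

    def convert(self, text):
        rows = list(csv.reader(io.StringIO(text)))
        if not rows:
            return ""
        header, body = rows[0], rows[1:]
        # Build a Markdown table: header row, separator row, then data rows.
        lines = [
            "| " + " | ".join(header) + " |",
            "| " + " | ".join("---" for _ in header) + " |",
        ]
        lines += ["| " + " | ".join(row) + " |" for row in body]
        return "\n".join(lines) + "\n"


class ConverterRegistry:
    """Picks a registered converter by file extension."""

    def __init__(self):
        self._by_extension = {}

    def register(self, converter):
        for ext in converter.extensions:
            self._by_extension[ext] = converter

    def convert(self, filename, text):
        for ext, converter in self._by_extension.items():
            if filename.lower().endswith(ext):
                return converter.convert(text)
        raise ValueError(f"no converter registered for {filename!r}")
```

Supporting a new format in this sketch means writing one more subclass and registering it, which is the general idea behind a plug-in architecture.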

Under the hood, MarkItDown uses libraries including python-docx (via mammoth), pandas, python-pptx, BeautifulSoup, speech_recognition, and pdfminer.six for handling different formats.

Despite its utility, MarkItDown also has limitations. For example, critics point out that MarkItDown functions largely as a wrapper around existing third-party libraries (like mammoth and pandas) rather than offering novel conversion capabilities or leveraging Microsoft’s internal knowledge of its own Office formats.

Other significant shortcomings of MarkItDown include:

Cannot process PDFs that lack prior OCR.
Strips all text formatting from PDFs, like headings and lists, during extraction.
Sometimes fails to recognize text within images embedded in PDFs.
Extracting descriptive content from standalone images requires configuring and using an external LLM client.
Active GitHub issues highlight ongoing problems such as incorrect image link extraction and potential loss of dynamic data when converting HTML.

Key use cases for MarkItDown

MarkItDown has four primary use cases:

LLM data ingestion: Converting internal documents, reports, spreadsheets, and presentations into Markdown for fine-tuning LLMs or building retrieval-augmented generation (RAG) systems.
Knowledge base creation: Transforming diverse company files into a unified Markdown format for searchable knowledge bases.
Text analysis pipelines: Standardizing input from various file types before feeding them into text analysis or data extraction workflows.
Content migration: Converting legacy documents into Markdown for modern documentation systems or websites.
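For the RAG and knowledge base use cases, preserved headings are what make the Markdown output easy to split into retrievable chunks. The following is a minimal sketch of that downstream step, assuming the Markdown text has already been produced. The split_by_heading function is a hypothetical helper, not part of MarkItDown, and it does not handle headings that appear inside fenced code blocks.

```python
import re

# ATX-style headings: one to six '#' characters followed by the title.
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def split_by_heading(markdown):
    """Split Markdown into (heading, body) chunks, one chunk per heading.

    Text that appears before the first heading is returned with a heading
    of None.
    """
    chunks = []
    title, body = None, []
    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            # Close the previous chunk before starting a new one.
            if title is not None or any(part.strip() for part in body):
                chunks.append((title, "\n".join(body).strip()))
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    if title is not None or any(part.strip() for part in body):
        chunks.append((title, "\n".join(body).strip()))
    return chunks
```

Each chunk can then be embedded or indexed on its own, with the heading kept as context for retrieval.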

Bottom line – MarkItDown

A valuable open-source contribution from Microsoft, MarkItDown directly addresses the crucial challenge of data preparation for LLMs. By offering a simple yet powerful way to convert many different file formats into structured, LLM-friendly Markdown, MarkItDown significantly lowers the barrier for developers looking to leverage diverse data sources in their AI applications. Its extensibility via plugins, permissive MIT license, and focus on preserving the structure of converted documents make it a compelling tool for anyone working at the intersection of data and generative AI.


Microsoft touts AI Dev Gallery for Windows 24 Apr 2025, 11:00 am

Microsoft is championing its AI Dev Gallery, now available as an open source app intended to help Windows developers integrate AI capabilities within their own projects and apps.

Initially announced as a project in December 2024, AI Dev Gallery was highlighted on April 22 as a mechanism to simplify AI development with .NET through samples, easy model downloads, and exportable source code. It is downloadable from the Microsoft Store, which lists it as a preview.

Serving as a playground for AI development using .NET, AI Dev Gallery allows developers to experiment with and implement AI capabilities in applications without needing a connection to cloud services. Included in AI Dev Gallery are dozens of interactive samples including retrieval-augmented generation (RAG), combining search with generative AI, chat interfaces, object detection, text-to-speech/speech-to-text conversion, and document summarization and analysis. Samples run on the developer’s local machine.

A key feature of AI Dev Gallery is the ability to view the C# source code behind each sample and export it as a standalone Visual Studio project with a single click, Microsoft said. To find and set up AI models, AI Dev Gallery allows developers to browse models from repositories such as Hugging Face and GitHub and download models via a single click. The gallery handles model compatibility, ensuring users get versions that work with the .NET ecosystem.


Puppet devsecops updated to deal with security maladies 23 Apr 2025, 9:18 pm

Perforce has updated its Puppet Enterprise Advanced platform for devsecops to offer more advanced remediation options with the goal of reducing risk in an era of AI-powered security threats.

Announced on April 22, the 2025.2 release of the platform fosters greater collaboration between platform and security teams, Perforce said. The company stressed that the divide between infrastructure management and security operations can delay a swift response. Today’s evolving threat landscape is becoming more sophisticated and agile, in part due to the misdirected power of AI.

Puppet’s latest platform update lets enterprises rapidly address known vulnerabilities by integrating security remediation within core infrastructure workflows to speed up responsiveness to identified threats, according to Perforce. Embedding this capability in established processes gives operations and security teams a shared understanding of their security posture and automates critical remediation tasks to eliminate inefficient cross-functional handoffs.

A collaborative environment shrinks the opportunity window for attackers, said Perforce. Future Puppet releases will speed up the pace further with human-in-the-loop, AI-driven automation. Puppet enables the platform team to support security initiatives to boost resiliency and reduce the MTTR (mean time to remediate).

Remediation of known vulnerabilities is streamlined by integration with security scanners. Tenable Nessus is included out of the box, and other popular scanners are available via an extension architecture and API framework. Also, self-service workflows and cross-functional visibility break down siloes to eliminate efficiency bottlenecks and accelerate resolution, Perforce said.


JDK 25: The new features in Java 25 23 Apr 2025, 6:03 pm

Java Development Kit (JDK) 25, a planned long-term support release of standard Java due in September, now has four features officially proposed for it. The two latest are module import declarations and compact source files and instance main methods, both previously previewed and now set for finalization in JDK 25.

Both features showed up April 21 on the JDK 25 reference implementation page on the OpenJDK website. Prior to these features, the first feature had been a preview of an API for stable values, which promises to speed up the startup of Java applications. The second was the removal of the previously deprecated 32-bit x86 port.

JDK 25 comes on the heels of JDK 24, a six-month-support release that arrived March 18. As a long-term support release, JDK 25 is set to get at least five years of premier-level support from Oracle. JDK 25 is due to arrive as a production release on September 16, following rampdown phases in June and July and two release candidates planned for August. The most recent LTS release was JDK 21, which arrived in September 2023.

The module import declarations feature, previewed in JDK 23 and JDK 24, enhances the Java language with the ability to succinctly import all of the packages exported by a module. This simplifies the reuse of modular libraries but does not require the importing code to be in a module itself. It includes the following goals:

Simplifying the reuse of modular libraries by letting entire modules be imported at once
Avoiding the noise of multiple type import-on-demand declarations when using diverse parts of the API exported by a module
Allowing beginners to more easily use third-party libraries and fundamental Java classes without having to learn where they are located in a package hierarchy
Ensuring that module import declarations work smoothly alongside existing import declarations. Developers who use the module import feature would not be required to modularize their own code.

Compact source files and instance main methods would evolve the Java language so beginners can write their first programs without needing to understand language features designed for large programs. Beginners can write streamlined declarations for single-class programs and seamlessly expand programs to use more advanced features as their skills grow. Experienced developers can likewise enjoy writing small programs succinctly without the need for constructs intended for programming in the large. This feature has been previewed in JDK Versions 21, 22, 23, and 24, albeit under slightly different names. In JDK 24 it was called simple source files and instance main methods.

Stable values are objects that hold immutable data. Because stable values are treated as constants by the JVM, they enable the same performance optimizations that are enabled by declaring a field final. But compared to final fields, stable values offer greater flexibility as to the timing of their initialization. A chief goal of the proposal, which is in a preview API stage, is improving the startup of Java applications by breaking up the monolithic initialization of application state. Other goals include enabling user code to safely enjoy constant-folding optimizations previously available only to JDK code, guaranteeing that stable values are initialized at most once, even in multi-threaded programs, and decoupling the creation of stable values from their initialization, without significant performance penalties.
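The “initialized at most once, even in multi-threaded programs” guarantee can be illustrated with a language-neutral analogy. The sketch below is written in Python and is not the Java stable values API: LazyOnce is a hypothetical class, and Python offers nothing like the JVM’s constant folding, so only the deferred, at-most-once initialization is shown.

```python
import threading


class LazyOnce:
    """Holds a value that is computed on first use and never replaced."""

    _UNSET = object()

    def __init__(self, supplier):
        # Creating the holder is cheap; the supplier does not run yet.
        self._supplier = supplier
        self._value = self._UNSET
        self._lock = threading.Lock()

    def get(self):
        if self._value is self._UNSET:            # quick check without the lock
            with self._lock:
                if self._value is self._UNSET:    # re-check while holding the lock
                    self._value = self._supplier()
        return self._value
```

Creation of the holder and initialization of its content are separate steps, so expensive setup runs only when the value is first needed rather than at application startup.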

Removal of the 32-bit x86 port, currently proposed to target JDK 25, involves removing both the source code and build support for this port, which was deprecated for removal in JDK 24. In explaining the motivation, the JEP states that the cost of maintaining this port outweighs the benefits. Keeping parity with new features, such as the Foreign Function & Memory API, is a major opportunity cost, according to the JEP. Removing the port will allow OpenJDK developers to accelerate development of new features and enhancements.

Other features that could find a home in JDK 25 include those previewed in JDK 24, such as a key derivation function API, scoped values, structured concurrency, flexible constructor bodies, and primitive types in patterns, instanceof, and switch. A vector API, incubated nine times from JDK 16 to JDK 24, also could appear in JDK 25.


Cubicles are a software development anti-pattern 23 Apr 2025, 11:00 am

I have yet to meet a software developer who said, “I just love working in a cube farm.” I’ve never run across a developer who would turn down an offer to work in their own office. And I’ve never met a software developer who said, “You know, distractions and interruptions are great for my productivity!”

But I have met plenty of software development managers who think that developers need nothing more than a workstation, a network connection, and a few inches of elbow room. Now, I have never actually heard a development manager say, “I’d love to create a working environment designed to stymie concentration!” or “Let’s set things up so it’s super easy to interrupt the developers with a question.” But they seemed to have those goals.

In a past career, I had an opportunity to help design a new building for a software company. I explicitly and willfully didn’t want to be one of the aforementioned managers who subconsciously (consciously?) created a working environment designed to limit developer productivity. When the architects called me in to get my input on how the building was to be designed, I had only two things on my list: indirect lighting and offices for the developers.

Here is my mini-rant about lighting. Most offices have those hideous fluorescent lights with the checkerboard coverings that shine their harsh light right down on your eyes and that drive you nuts with the buzzing and occasional flickering.

Lighting in an office should point upwards, softly bouncing light off the ceiling. This seems blatantly obvious to me. And people wonder why I wear a baseball hat in the office.

A door for every developer

The other item on my list? Offices with a door for every developer. Small offices would be fine. Even offices with frosted glass fronts would be great. Developers need a quiet personal space, where they can concentrate, as Borland understood. But alas, it was not to be. The offices were deemed too expensive. Even high cubicles with doors cost too much. Alas.

Old school companies seemed to understand a bit more what developers need. Joel Spolsky understood it as far back as 2003. He was well aware that the “standard” space just wouldn’t cut it for developers.

At its peak, Borland built a beautiful campus with a magnificent office building made up of six three-story pods. The third floor of each pod was dominated by offices pretty much everywhere. These spaces let the software team shut the door and actually think. Borland’s culture was such that only a reverent soul would venture onto the third floor to talk to a developer.

The unwritten rule was that if a developer’s door was closed, they were to be left alone. If the door was open a crack, you could knock, but you’d better have a good reason. An open door was an invitation to come in and talk.

Now that’s a culture that understands developer productivity.

As we are all too well aware, cubicle farms are the norm, despite their obvious disadvantages. And if cube farms weren’t bad enough, some managers started advocating for “open offices” where no one has an office, not even a cubicle, because the alleged collaboration benefits outweigh the disadvantages that common sense tells us these silly office layouts bring. Sometimes all I can do is shake my head.

Concentration-cancelling workspaces

“Solutions” for these manager-made problems often include, “Well, just wear noise-cancelling headphones.” That’s great, but that is a symptom, not a cure. If developers have to pipe white noise into their ears to be able to concentrate, maybe the problem isn’t your developers not being able to adapt, but your cheap, “modern” office layout.

Open offices are supposed to lead to all kinds of spontaneous collaboration and serendipitous meetings that lead to marvelous ideas. Great in theory, but not so great in practice. Developers don’t lack collaboration. They lack uninterrupted time to do their work.

We pay software developers big salaries and buy them expensive computers, and then we skimp on their workspaces. We hire them to produce work that requires deep concentration, so why not let them do that? No one ever thinks “Hey! Let’s have open spaces for the orchestra members to individually practice!” No, they get their own practice rooms.

You don’t put a thoroughbred in a stall for pack mules. So the next time you’re planning a workspace, ask yourself: Do I want software that runs fast and stable—or just a lot of developers wearing headphones pretending they are concentrating?


Comparing Angular, React, Vue, and Svelte: What you need to know 23 Apr 2025, 11:00 am

The last time I compared the leading reactive JavaScript frameworks for InfoWorld was in 2021. It’s amazing how much has changed since then. All three of the frameworks I covered—Angular, React, and Vue—have made enormous strides with new features and deep refactors. Angular, in particular, is a modern Renaissance story, owing much of its success to greatly improved developer experience. Svelte has also established itself as a lasting and valuable option, worthy of being included in the current comparison.

The evolution of reactivity

Given all the advancements and embellishments, it’s worth taking a moment to ground ourselves in what is considered a “reactive framework.” At heart, reactivity is all about data binding. This is the mechanism whereby the framework takes data (a.k.a., “state”) and binds it to elements of the user interface so that changes to the data are automatically reflected in the view. Data binding is a simple idea that nets enormous benefits in manageability. It is foundational to the present era of web user interfaces.

A huge range of additional capabilities have sprung up around this basic definition of reactivity. Many reactive frameworks have blossomed into full-blown platforms that include build toolchains, server-side rendering and API engines, integrated deployment adaptors, and plenty more. The most significant of these is the move toward server-side (or full-stack) support. Each of the frameworks here has at least one good full-stack option.

All four frameworks are known for their high quality. There are differences in style, syntax, and philosophy, but they’ve all benefitted from intensive development by teams of talented developers. (If you enjoy programming for programming’s sake, you can find a lot to appreciate by poking around their respective GitHub projects.)

You won’t go wrong using any of these frameworks. But there are some notable differences.

Usage statistics

All the frameworks here are very popular. NPM downloads are a good indicator of actual usage, and by this measure, React is an absolute behemoth. Its more than 30 million weekly NPM downloads imply north of one billion annual downloads of React. On the other side of the scale is Svelte, the most humble of the four, with “only” about 1.8 million weekly downloads.

NPM weekly downloads: March 25, 2025	Downloads
React	33,918,683
Vue	6,302,238
Angular	3,683,390
Svelte	1,795,976

A snapshot of NPM weekly downloads in March 2025.
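The “north of one billion” figure follows from simple arithmetic on the table above. Here is a minimal sketch of that calculation, assuming the weekly snapshot is representative of the whole year (a simplification); the function names are illustrative only.

```python
# Weekly NPM downloads from the March 25, 2025 snapshot in the table above.
weekly_downloads = {
    "React": 33_918_683,
    "Vue": 6_302_238,
    "Angular": 3_683_390,
    "Svelte": 1_795_976,
}


def annualized(weekly):
    """Scale a weekly count to a year, assuming 52 identical weeks."""
    return weekly * 52


def share_of_total(counts):
    """Return each framework's fraction of the combined weekly downloads."""
    total = sum(counts.values())
    return {name: count / total for name, count in counts.items()}


if __name__ == "__main__":
    for name, weekly in weekly_downloads.items():
        print(f"{name}: {annualized(weekly):,} per year")
    for name, share in share_of_total(weekly_downloads).items():
        print(f"{name}: {share:.1%} of the four combined")
```

By this estimate React alone accounts for roughly three quarters of the combined weekly downloads of the four frameworks.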

Much of this activity is from automated build systems and the like, but downloads are still a reliable indicator of relative use. React is the mainstream, conservative choice, but the other three have no trouble holding their own.

GitHub stars

Next, let’s consider GitHub stars as an indication of developer interest.

GitHub Stars: March 25, 2025	Stars
React	234,000
Vue	209,000
Angular	97,300
Svelte	82,000

GitHub stars as a measure of framework popularity in March 2025.

This chart is more balanced. React and Vue are neck-and-neck, as are Svelte and Angular. Even on the low side, these numbers indicate that developers are interested in all four libraries. Svelte’s 82,000 stars indicate a very active project, and I expect its popularity to continue growing.

The important thing to recognize about GitHub stars is that a human being bothered to leave one. Tens of thousands of developers have expressed interest in each of these projects. Many of them have gone one step further by contributing to a project. This is all indicative of a very healthy open source ecosystem.

Stack Overflow and Reddit

Now let’s consider the frameworks’ standing on two social sites popular with developers.

Framework	Stack Overflow watchers	Stack Overflow questions	Reddit users
React	521,300	482,600	452,000
Vue	117,200	108,800	119,000
Angular	228,400	308,300	80,000
Svelte	4,900	6,300	45,000

Measuring attention to the top reactive frameworks on Stack Overflow and Reddit.

First, we’re looking at the “watchers” (users subscribed to the framework as a topic) and the number of questions asked on Stack Overflow. Rather than indicating a shortcoming with Vue and Svelte, I believe their comparatively lower numbers may be due to developers using channels like Reddit to find the answers they need.

The numbers speak for themselves in terms of React’s dominance, but all are active, viable communities.

Community involvement

Finally, I’m going to go out on a limb and offer a more subjective characterization based on each project’s GitHub activity. Here is how community-driven each project currently looks to me.

Framework	Community involvement
React	Medium
Vue	Medium
Angular	Very high
Svelte	High

How community-driven is it?

Angular and React are both backed by big tech (Google and Facebook, respectively, or Alphabet and Meta if you prefer), but the developer community around them is still very grassroots. With React, Facebook initially bridged corporate sponsorship with the open source community. Currently, though, it seems the React team is designing things from within and issuing them out to the community. Angular has done the opposite, opting for community-first engagement to guide changes to the project.

Svelte and Vue both started life as the brainchild of an individual, then grew big communities of users and contributors. For a time, Vue was very active in the open source space. More recently, it has lost some of that dynamism, possibly because its founder is busy building another popular tool in Vite. Svelte still maintains that open source style of openly debating issues and changes, although currently not to the same extent as the Angular team.

Now that we have an overview of the four frameworks in the overall developer space, let’s look at how they compare based on usability and features.

Which one is easiest to learn and use?

In this day and age, we can readily use AI to generate component examples of development frameworks. But AI won’t tell us how it feels to use a framework, especially as you add more elements to the project. What you want is something that starts simple (they all do) and keeps that simplicity as much as possible as the project becomes more complex.

If you can still dip into the code and make changes in a sensible, controllable way—even after API calls are flying, component states are interacting, and stores are firing events—you are doing great. Much of that depends on how you use the tools, more than the tools themselves. But the frameworks do have their role to play. Svelte is the easiest platform to pick up. The purpose of its code is obvious, meaning it is easy to read and understand. Angular has made huge strides by eliminating module requirements, and even Vue and React are easy for new users to start learning.

Here’s how I rank the frameworks in terms of simplicity and ease of learning:

Svelte
Vue
React
Angular

Angular is still the biggest mental meal to swallow, but if you’ve been burned in the past, it’s well worth a revisit. These frameworks compete and influence each other on the developer experience (DX) front, and they are all well-designed as a result.

Code comparison of React, Vue, Angular, and Svelte

Next, I’ve used Google’s Gemini Code Assist to generate component samples for each of the frameworks. I also reviewed and validated the samples as a best practice for working with AI-generated code. Let’s see what we can glean from the samples.

React


import React, { useState } from 'react';

function Counter() {
  const [count, setCount] = useState(0);

  return (
    <div>
      <p>Count: {count}</p>
      <button onClick={() => setCount(count + 1)}>Increment</button>
    </div>
  );
}

export default Counter;

Here we have a functional component that makes use of the useState hook. The return value is JSX. This lets us describe a template for the view with an HTML-like syntax that is capable of accessing the count variable created by the useState call.

Combined with the onClick handler, we end up with a reactively wired-up UI, such that the count variable is incremented and reflected in the {count} displayed on the interface. This is a pretty clean arrangement. Probably the biggest hurdle is becoming familiar with the useState syntax and the functional syntax inside the onClick handler.

Angular


import { Component, signal } from '@angular/core';

@Component({
  selector: 'app-counter',
  standalone: true,
  template: `
    <p>Count: {{ count() }}</p>
    <button (click)="increment()">Increment</button>
  `,
})
export class CounterComponent {
  count = signal(0);

  increment() {
    this.count.update(value => value + 1);
  }
}

Angular is still the most verbose framework, and the component sample shows it. Like React and Vue, it needs an import call. The @Component decorator is required by Angular but you don’t need it with the other frameworks, where the component-hood is deduced by context.

Angular creates a custom component with the selector property, so this component introduces the app-counter element, which you have to know when using it in the parent. In general, Angular tends toward being more explicit, which requires more effort as a developer. On the other hand, it allows the Angular engine to be very aggressive about optimizations.

The {{ count() }} interpolation expression is obvious and the (click) syntax for event handling is less verbose than React’s.

In Angular’s favor, you are engaging with signals, which have become something of a de facto standard, and popular for their reactive programming power.
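To make the idea of a signal concrete, here is a minimal sketch of the pattern in Python. It is not Angular’s implementation: the Signal class and its subscribe method are hypothetical, and real signal libraries track dependencies automatically rather than requiring explicit subscriptions.

```python
class Signal:
    """Minimal observable value: read by calling it, change with set() or update()."""

    def __init__(self, initial):
        self._value = initial
        self._subscribers = []

    def __call__(self):
        # Reading mirrors the count() call style used in the template above.
        return self._value

    def set(self, value):
        if value == self._value:
            return  # unchanged values do not notify subscribers
        self._value = value
        for callback in list(self._subscribers):
            callback(value)

    def update(self, fn):
        """Compute the next value from the current one, as count.update() does."""
        self.set(fn(self._value))

    def subscribe(self, callback):
        """Register a callback and run it once with the current value."""
        self._subscribers.append(callback)
        callback(self._value)
```

A subscriber that re-renders a label plays the role of the template here: every update() pushes the new value to it without the caller touching the view directly.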

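Vue

<template>
  <div>
    <p>Count: {{ count }}</p>
    <button @click="increment">Increment</button>
  </div>
</template>

<script setup>
import { ref } from 'vue';

const count = ref(0);

function increment() {
  count.value++;
}
</script>

Vue has a tight syntax, although I am not a fan of the framing template tags. The interpolation syntax is the same as Angular’s, and the event-handling syntax is comparably simple. (Personally, I prefer Angular’s use of parentheses on the handler property, which makes it explicit that you are calling a function.)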

The setup attribute on the <script> tag is a Vue 3 shortcut that sidesteps a lot of boilerplate we had in Vue 2. As with React and Angular, an import (ref) is required.

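Svelte

<script>
  let count = 0;

  function increment() {
    count += 1;
  }
</script>

<p>Count: {count}</p>
<button on:click={increment}>Increment</button>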

Svelte’s code doesn’t even require an import call. There’s very little to it beyond describing exactly what happens. Probably the <script> tag is the only extraneous thing to remember. Keeping all the syntax front-of-mind is easy to do.

There is no added syntax (like useState) on the count variable, the interpolation syntax is as simple as possible, and the on:click is comparable to the others in this section (although, again, I kind of prefer Angular’s increment() with the parentheses). The surrounding formality is kept to a bare minimum and does not clutter the purpose of the component.

Comparing component highlights

To me, the most immediately obvious syntax is Svelte’s. Vue’s syntax is also very tight, even though I am not a huge fan of the framing template tags. React’s functional style is slick but the function definition can’t compete with Svelte’s compiler-driven elimination of such formality. (Although one could argue in favor of React’s JavaScript obviousness.)

Angular is still the most verbose, and it suffers from having to make an import call for such a simple use case. But in its favor, Angular forces you to engage with signals, yielding power over the long term.

Scalability and performance

All four frameworks provide the tools to develop highly optimized applications. Along with DX, performance is a major area of focus. You can see that focus in React’s adoption of a compiler to boost optimizations like caching/memoization. Angular, for its part, brings interesting capabilities around deferred component rendering.
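Caching/memoization, the optimization mentioned above, means reusing a previously computed result when the inputs have not changed. The sketch below shows the general technique with Python’s standard functools.lru_cache. It is an analogy only and says nothing about how React’s compiler works internally; render_label and render_log are hypothetical names.

```python
from functools import lru_cache

# Records every time the "expensive" render actually runs.
render_log = []


@lru_cache(maxsize=128)
def render_label(count):
    """Pretend-expensive render step; cached per distinct input value."""
    render_log.append(count)
    return f"Count: {count}"
```

Calling render_label twice with the same count runs the body once; the second call is answered from the cache, which is the effect a memoizing compiler or hook aims for when a component’s inputs are unchanged.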

Each of the frameworks offers code splitting and server-side rendering. It’s down to the developer to make the most of these tools, however. Like the core framework, it’s a matter of personal preference which tool gives you the best options for optimization.

Each framework is enterprise-grade and is deployed into production on large-scale apps. Even more than other factors, how the app handles large, interconnected designs and scaling to heavy use is heavily dependent on the specifics of your use case. They’re also all compatible with server-side frameworks for full-stack integration.

Which reactive framework is for you?

Svelte, Angular, React, and Vue all carry the banner of open source software high and bright. Prototyping is the best way to get a real feeling for a framework, but that is not always possible. As developers, we tend to use what we already know or what is already in use. Organizations (and developers) often seek the safest or most mainstream option, which in this set is undoubtedly React. Technology inertia, more so than superior technology, is the driving force behind React’s numerical dominance.

Even if you are only experimenting on weekends, it’s worth exploring other frameworks. The reactive frameworks profiled here are all well-established options with good ideas and large ecosystems. With experimentation, you may find you appreciate one framework’s DX over the others.

If you are in the market for a new reactive framework, and you have the time or organizational permission to experiment, Svelte is the most tempting option. I think Angular is one to keep an eye on. Vue is a solid option, although it is not very active currently. React retains its position as the top choice for developers who just need to get the job done.


4 big changes WebAssembly developers need to know about 23 Apr 2025, 11:00 am

WebAssembly gives developers a whole new way to deliver applications to the web and beyond. Instead of writing solely in JavaScript, developers can write in various other languages, compile that to WebAssembly’s bytecode format, and run it in a sandboxed environment at near-machine speeds.

But Wasm (the popular abbreviation for WebAssembly) is also relatively new, and it’s still evolving in fits and starts. Key features that would massively expand Wasm’s power and usefulness aren’t there yet or must overcome major obstacles before they can be rolled out.

Here’s a rundown of the most anticipated features on the horizon for Wasm. I’ll explain what they do, why they matter, and what it’ll take to implement them. Some are still a ways off, while others are already available in provisional form.

Async

JavaScript, and web apps generally, rely heavily on asynchronous operations. WASI, the WebAssembly System Interface specification, doesn’t yet natively support async I/O, but one of the key features promised for WASI 0.3 (originally planned for March 2025) is native support for async. With this addition, it will be possible to implement or call functions asynchronously.

Right now, any Wasm code that wants to use async behaviors must resort to a variety of awkward workarounds. Options include launching multiple Wasm instances or using nonstandard host callbacks that depend on a particular runtime’s way of doing things. The WASI 0.3 async plan lays out an official, blessed way to do async in the runtime and in WebAssembly itself. It also includes plans for the existing WASI 0.2 spec to have async support polyfilled or virtualized.

Not everything that async needs will be supported out of the gate. For instance, 0.3 won’t roll out a cancellation protocol for async tasks; that’s slated to be added in a later 0.3.x release.

Multithreaded execution

Another major missing feature in Wasm and the WASI spec is support for threading. Wasm runtimes run single-threaded, which severely hobbles workloads that otherwise benefit from threading. Some existing workarounds are similar to how developers cope with the lack of async—for example, by spinning up multiple instances of the runtime and splitting workloads between them. But those solutions are bulky and inelegant.

Adding multithreaded execution to WebAssembly isn’t on the 0.3 roadmap, but there is an extant, approved proposal for doing it. Threads, called agents in the proposal, provide for shared memory between agents and atomic operations for memory access. Another proposal, shared-everything-threads, goes further with more advanced, forward-looking features, including compatibility with the proposed garbage collection mechanisms for Wasm.
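The two building blocks named in the proposal, shared memory and atomic memory access, already have counterparts on the JavaScript side. The sketch below uses the standard `SharedArrayBuffer` and `Atomics` globals as a stand-in for a shared Wasm linear memory. It runs on a single thread, so it shows only the shape of the operations, not real contention between agents. The `COUNTER_INDEX` constant and the `incrementCounter` helper are hypothetical.

```typescript
// Stand-in for a shared linear memory: 4 bytes, viewed as one 32-bit integer.
const shared = new SharedArrayBuffer(4);
const cells = new Int32Array(shared);

const COUNTER_INDEX = 0; // hypothetical slot used as a shared counter

// Atomically add 1 and return the new value.
// Atomics.add returns the value that was stored *before* the addition.
function incrementCounter(view: Int32Array): number {
  return Atomics.add(view, COUNTER_INDEX, 1) + 1;
}

// A second view over the same buffer sees the same bytes,
// much as a second agent sharing the memory would.
const otherView = new Int32Array(shared);

incrementCounter(cells);
incrementCounter(otherView);

// Atomic read of the counter after both increments.
const total = Atomics.load(cells, COUNTER_INDEX);
```

With real threads, the atomic add is what keeps two simultaneous increments from overwriting each other; a plain `cells[0] += 1` offers no such guarantee.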

Shared memory

Shared memory goes hand in hand with threading. The existing threading proposal brings with it a basic plan for how to share memory between components, but it’s mostly a way to share raw regions of linear memory. It’s not intended for sharing complex data objects.
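To see why raw linear memory is awkward for complex data, consider what it takes to pass even a small record through it: both sides must agree on a byte layout by hand. Here is a minimal TypeScript sketch, assuming a hypothetical `Point` record and a made-up layout (two little-endian 64-bit floats at consecutive offsets), with a plain `ArrayBuffer` standing in for Wasm memory.

```typescript
// Hypothetical record to be passed through raw memory.
interface Point {
  x: number;
  y: number;
}

const POINT_SIZE = 16; // two float64 fields, 8 bytes each (assumed layout)

// Serialize a Point into the buffer at the given byte offset.
function writePoint(memory: ArrayBuffer, offset: number, p: Point): void {
  const view = new DataView(memory);
  view.setFloat64(offset, p.x, true);     // little-endian
  view.setFloat64(offset + 8, p.y, true); // second field follows the first
}

// Deserialize a Point from the buffer; must mirror writePoint exactly.
function readPoint(memory: ArrayBuffer, offset: number): Point {
  const view = new DataView(memory);
  return {
    x: view.getFloat64(offset, true),
    y: view.getFloat64(offset + 8, true),
  };
}

const memory = new ArrayBuffer(64);
writePoint(memory, POINT_SIZE, { x: 1.5, y: -2 });
const roundTripped = readPoint(memory, POINT_SIZE);
```

Anything richer than fixed-size numbers (strings, nested objects, references) needs more hand-written conventions of this kind, which is the gap the paragraph above describes.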

A recent standard addition called multiple memories offers a more robust option. Memory segments can be defined in an application and isolated from each other, only sharing what needs to be shared. The next step is to allow better support for this feature in languages that compile to Wasm. Rust, for example, has explicit support for multiple memories. (One of the valid arguments for Rust being an excellent first choice for compiling to Wasm is how quickly and effectively it implements evolutionary changes to the Wasm runtime.)

Garbage collection

Many higher-level languages use automatic memory management, including JavaScript, C#, Java, Python, and Swift. Memory-handling details, like allocating memory and releasing it when it’s no longer needed, are left to the runtime.

Originally, the WASI spec and Wasm runtimes did not have native support for garbage collection or garbage-collected structures. Any language compiling to Wasm that wanted to support those features had to roll its own garbage collection. For instance, the Go and Python languages each include their own garbage-collected runtimes. The downside is that the entire runtime must be included with every Wasm program. Adding native garbage collection to Wasm would allow ports of those language runtimes to use Wasm’s own garbage collection mechanisms wherever possible.

The first iteration of garbage collection is now available in Wasm, but like the other proposals I’ve mentioned, it’s only enough to get things started. One important point for even this early iteration of Wasm GC is that you don’t pay for it if you don’t use it. Wasm programs that never use garbage collection will see no changes to their code.

Two things are coming next. The first is more robust features for Wasm GC, such as safe interaction with threads. Another is to have compilers that target Wasm and Wasm-ported runtimes utilize Wasm GC. For this to work, the compilers and runtimes in question will need to be modified. The Kotlin language, for instance, has introduced a Wasm implementation that uses Wasm GC, although it’s still considered an alpha-quality project.

The path to garbage collection reveals a pattern for other major Wasm features to come. Just having a feature defined formally in a spec or added to the available runtimes isn’t enough. All the languages that compile to Wasm must support the feature directly to make the most of it, and the timeline for that will vary with every language. That said, languages that can compile directly to Wasm (like Rust) will likely be quicker to add support, because implementing new features is more straightforward for them, while those with a ported runtime (like Python or Go) will lag.


Vibe code or retire 22 Apr 2025, 11:00 am

I hope this title ticked you off a bit. Get mad. Pound the table. Shake your head and then listen. You need to learn to vibe code, or your career as a software developer will end.

Vibe coding is a cute term for using the latest generation of code generation tools that use large language models (LLMs, or “ChatGPT” for the unwashed). There are a number of tools (all bad), such as Cursor, Codeline, and Tabnine, and now the granddaddy of them all, GitHub Copilot, is getting in on the game. Most are Visual Studio Code forks or plugins.

I am not saying that you, with no game coding experience, can just GPT out a game in a week or two and get the president to talk about it on his Twitter and become a millionaire, though stranger things… no, precisely that has happened before. I am not saying the tools are good or will not produce worse code with more security holes. I am saying that if you do not learn how to use them relatively soon, you will have to retire from the industry.

Same old story

This sort of thing has happened before. Near the start of my career, I met a developer, who we’ll call Tom. Tom was an old-school hunt-and-peck programmer I worked with in one of my first jobs. Tom could produce a report in Visual Basic every six months. I don’t know what language he knew before, but he learned VB from books. When he went on vacation, I finished the first report we were supposed to do together in a week and a half. I saved him a part to do, and Tom hated me for it from then on.

How did I do it so fast? I used the IDE and googled (before Google was a verb, or any good, and long before it got bad). When we switched to Java, I learned the new language (in about a week) from the Internet and learned JBuilder simultaneously. Tom bought Bruce Eckel’s Thinking in Java and hunted and pecked on his keyboard and thumped the Eckel book as a bible. Tom didn’t want to learn new ways of doing things; he wanted to scorn the world for the way it worked.

On LinkedIn there are two kinds of people. There are the people who were hawking web3 a year ago and who are making wild claims about vibe coding today. Then there are all the Toms, whining about security and the art of coding and everything else. If you are one of the Toms, you need to set your alarm. Learning new ways of doing things is part of the job description.

You write business software. You are not an artist or code poet. No one cares about “software craftsmanship.” Your boss is right—learn the new way of doing things and code faster. Or just do what Tom did after meeting a 20-something keyboard clacker who knew how to google—retire. That’s it. Vibe code or retire.

Are LLMs really that good? Yes. Can you generate complete applications? Yes. Will the output suck? Only if you do. You see, this is not a panacea. The new “you can do it without coding” isn’t any different from the last generation of “you can do it without coding” tools. They sort of work without coding, unless you need a 2.0 or something complicated. Just like before (nothing has changed), they are just faster and better.

Give it time

Your first application using vibe coding tools will be terrible, and you will find it frustrating, and you will be bad at it. That is not the tool’s problem. It is yours. This is partly because these tools have the maturity and stability of JBuilder, Visual Basic 4.2 (maybe not that bad), and JavaScript 1.0. When you use them, they do aberrant things that annoy you, and you want to rage quit. Then, you learn to adapt, work with them, and start being faster. If you’re even a little good at coding, they’ll slow you down after the initial scaffolding. You’ll shake your head at some of its decisions, and then you’ll learn to make it do what you want—just like with every new development tool or technology before it.

When you get the hang of it, you will have new skills. I can code, but I never really cared to learn JavaScript. Nevertheless, I have written a rather complex, extensive JavaScript application in under three weeks. I could have done all of it without the LLM, eventually, but searching for the right APIs and learning the extension points would have taken time—even with Google (before it got bad). What I’ve vibe coded would have taken me like three months. I’m not 10x with AI, but I’m 5x my usual productivity.

So, do you want to stay gainfully employed during the upcoming hyperinflation? Cool. Here are a few tips:

Start with the free version of Cursor, Codeline, or whatever—then pay at least $40. Watching the thing draw is not going to keep you moving.
Pick a problem you would code if you “had time,” like the fellow who solved his Zoom storage problem.
Use Git frequently. It is going to do some idiotic things, and you should be prepared.
Have a design discussion with the LLM before you code and have it output in Markdown. You can usually reference this in your vibe coding IDE’s “rules” settings, so you don’t have to keep reminding the model what you’re coding. Or you can feed it the Markdown when it forgets.
Verify each step and revert or undo when the model does something stupid.
Stick with it even when it frustrates you. You’ll learn how to make it work.
Use Claude 3.7 Sonnet (currently) or maybe Gemini 2.5 Pro (or Experimental) as your LLM.

Vibe coding is coming. This is how we will all write code in the future. Start learning it now—or retire.

