From an ops point of view as orgs get big enough, dev wraps around to being prod-like... in the sense that it has the property that there's going to be a lot of annoyed people whose time you're wasting if you break things.
You can take the approach of having more guard rails and controls to stop people breaking things but personally I prefer the "sandpit" approach, where you have accounts / environments where anything goes. Like, if anyone is allowed to complain it's broken, it's not sandpit anymore. That makes them an ok place to let agents loose for "whole system" work.
I see tools like this as a sort of alternative / workaround.
But particularly for devops / systems focused work, you lose too much "test fidelity" if you're not integrating against real services / cloud.
Yeah, working on the landing page. Feel free to ask any other questions!
Interesting idea, few things:
- The website tells less than your comment here. I want to try but have no idea how destructive it can be.
- You need to add / mention how to do things in the RO mode only.
- Always explain destructive actions.
A few weeks ago I had to debug K8s on the GCP GDC metal. Claude Code helped me tons, but... I had to recreate the whole cluster the next day because the agent ran too fast and deleted things it should not have deleted, or should at least have told me the full impact of first. So some harness would be nice.
- The Kubernetes example is exactly what this is built for: giving AI access is dangerous because there is always a chance of it messing something up. Thanks for the comment!
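To make the "RO mode only" and "always explain destructive actions" requests above concrete, here is a minimal sketch of a gate that lets read-only kubectl commands through and flags everything else for a human. The allowlist and the `classify` function are my own assumptions for illustration, not something Fluid or Claude Code ships, and the check fails closed: anything it cannot positively identify as read-only needs approval.

```python
import shlex

# Assumption: a hand-picked allowlist of kubectl verbs that only read state.
READ_ONLY_VERBS = {"get", "describe", "logs", "top", "explain", "version", "api-resources"}


def classify(command: str) -> str:
    """Return 'read-only', 'needs-approval', or 'not-kubectl' for a command line.

    Assumes the command is executed directly (no shell), so metacharacters
    like ';' or '|' are never interpreted.
    """
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes and similar: fail closed.
        return "needs-approval"
    if not tokens or tokens[0] != "kubectl":
        return "not-kubectl"
    # The first non-flag token is treated as the verb. Flags with a separate
    # value placed before the verb ("-n prod get pods") make the value look
    # like the verb, which is not in the allowlist, so the command is flagged.
    verb = next((t for t in tokens[1:] if not t.startswith("-")), None)
    return "read-only" if verb in READ_ONLY_VERBS else "needs-approval"


if __name__ == "__main__":
    for cmd in ("kubectl get pods -A", "kubectl delete namespace prod"):
        print(f"{cmd!r}: {classify(cmd)}")
```

A real harness would also want to print what a flagged command is going to touch before asking, which is the part that would have saved the cluster.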
Note: nothing against fluid.sh, I am struggling to figure out something to build.
In a complex project the hard parts about software are harder than the hard parts about the domain.
I've seen the type of code electrical engineers write (at least as hard a domain as software). They can write code, but it isn't good.
The ability to acquire domain knowledge quickly however, isn't exactly the same as the ability to develop complex software.
The best part, of course, is that this mostly works, most of the time, for most businesses.
Now, the same domain experts -who still cannot code- will do the exact same thing, but AI will make the spreadsheet more stable (actual data modelling), more resilient (backup infra), more powerful (connect from/to anything), more ergonomic (actual views/UI), and generally more easy to iterate upon (constructive yet adversarial approach to conflicting change requests).
Hallucinations sure make spreadsheets nice and stable.
From the view you describe, it seems AI just lets you experiment faster, when all you want to do is experiment. You find product market fit easier, you empower designers more, etc. Much easier to iterate and find easy wins from alternative designs - as long as your fundamentals work!
Only problem is that you are experimenting in public, so the massive wave of new AI generated features come to the public from everywhere at once. Hence the widespread backlash.
Not to mention, the core job function when you are experimenting is different from what defines a lot of hard technical progress: creating new technologies, or foundational work that others build on, is naturally harder and slower than building e.g. CRUD services on top of an existing stack. Deep domain expertise matters for selling, deep programming expertise matters for stability. I don't know, curious where the line will end up getting drawn.
There’s a requisite curiosity necessary to cross the discomfort boundary into how the sausage is made.
Programming is not something you can teach to people who are not interested in it in the first place. This is why campaigns like "Learn to code" are doomed to fail.
Whereas (good) programmers strive to understand the domain of whatever problem they're solving. They're comfortable with the unknown, and know how to ask the right questions and gather requirements. They might not become domain experts, but can certainly learn enough to write software within that domain.
Generative "AI" tools can now certainly help domain experts turn their requirements into software without learning how to program, but the tech is not there yet to make them entirely self-sufficient.
So we'll continue to need both roles collaborating as they always have for quite a while still.
My belief is that engineers should be the prime candidates to be learning the domain, because it can positively influence product development. There are too many layers between engineers and the domain, IME.
The beauty of LLMs is that they can quickly gather and distill the knowledge on both sides of that relationship.
Web dev is low entry barrier and most web devs don’t need a very deep knowledge base.
Embedded, low level language, using optimizations of the OS / hardware require MUCH more specialized knowledge. Most of the 4 year undergraduate program for Computer Science self selects for mathematics inclined students who then learn how to read and learn advanced mathematics / programming concepts.
There’s nothing that is a hard limit to prevent domain expert autodidacts from picking up programming, but the deeper the programming knowledge, the more the distribution curves of programmers / non-programmers will be able to succeed.
Non programmers are more likely to be flexible to find less programming-specific methods to solve the overall problem, which I very much welcome. But I think LLM-based app development mostly just democratizes the entry into programming.
"Are there more or less examples of successful companies in a given domain that leverage software to increase productivity than software companies which find success in said domain?"
But an answer to your question would be Capital One.
In what domains have you had experience taking non programmers with domain knowledge and making them programmers?
Sure, I could go and create an accounting app - or a clinical trial recruitment app - as a basic clone of what I've already created. And I might even make it better for some niche. But even if I know what that product system needs, I still need to find someone with the relationships to get in the door.
The trick is - you don't need an idea man for a non-technical founder. You really need someone with a rolodex and a problem.
My codebase is full of one-offs that slowly but surely converge towards cohesive/well-defined/reusable capabilities based on ‘real’ needs.
I’m now starting to pitch consulting to a niche to see what sticks. If the dynamic from the office holds (as I help them, capabilities compound) then I’ll eventually find something to call ‘a product’.
He kept ranting about what a b*tch of a problem that was, every time we went out drinking, and one day, something got into me, and I thought there must be some software that can help with this.
Surely there was, and I set up a server with an online web UI where every employee could put in when they were able to work, and the software figured out how to assign timeslots to cover requirements.
I thought it was a nice exercise for me in learning to administer a Linux server, but when I showed it to my friend, he looked me in the eye and told me I saved him a day of work every week, and called me a wizard :D
It occurred to me how natural a part of the programming profession it is to make things, in fixed amounts of time, that turn difficult and time-consuming tasks a human needed to do into something that essentially just happens on its own.
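For anyone curious what "figuring out how to assign timeslots to cover requirements" can look like, here is a minimal greedy sketch. The comment does not say which software was used or how it worked, so the `assign_shifts` function, its inputs, and the heuristic (fill the hardest slots first, prefer whoever has the fewest shifts so far) are all assumptions for illustration. A greedy pass like this is not guaranteed to find the best schedule.

```python
def assign_shifts(availability, required):
    """Greedy shift assignment (illustrative heuristic, not optimal).

    availability: dict of employee name -> set of slots they can work
    required:     dict of slot -> number of people needed
    Returns (assignment, uncovered): slot -> list of names, and
    slot -> number of people still missing.
    """
    load = {name: 0 for name in availability}
    assignment = {}
    uncovered = {}

    def available_count(slot):
        return sum(slot in slots for slots in availability.values())

    # Handle the slots with the fewest available people first.
    for slot in sorted(required, key=available_count):
        candidates = sorted(
            (name for name, slots in availability.items() if slot in slots),
            key=lambda name: (load[name], name),  # least-loaded first, then by name
        )
        chosen = candidates[: required[slot]]
        for name in chosen:
            load[name] += 1
        assignment[slot] = chosen
        missing = required[slot] - len(chosen)
        if missing > 0:
            uncovered[slot] = missing
    return assignment, uncovered


if __name__ == "__main__":
    availability = {"ana": {"mon", "tue"}, "ben": {"mon"}, "cy": {"tue", "wed"}}
    required = {"mon": 1, "tue": 1, "wed": 1, "thu": 1}
    assignment, uncovered = assign_shifts(availability, required)
    print(assignment)
    print("still uncovered:", uncovered)
```

The `uncovered` result is the useful part for a manager: it says which slots nobody volunteered for, which is the conversation that used to take a day.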
I mean it in terms of owning the solution to a problem, being accountable/responsible for something working e2e not just the software or even the product - the service/experience of the customer that makes them want to give you money. Once you put on another hat - guess what - you'd probably be the star of some operations team or a great supervisor of some department. You would automate everything around you to a point others think you're the most capable person they've ever seen in that role.
It’s really liberating. Instead of saying “gosh I wish there was an app that…” I just make the app and use it and move on.
There are an infinite number of problems to solve.
Deciding whether they’re worth solving is the hard part.
Outside of tech companies, I think this is extremely common.
Don’t get me wrong, I have found uses for various AI tools. But nothing consistent and daily yet, aside from AI audio repair tools and that’s not really the same thing.
They'll work for hours and end up with $4 of gold
These are the pets.com of the current bubble, and we'll be flooded by them before the damn thing finally pops.
Or maybe ask yourself what do you like to do outside of work? maybe build an app or claude skill to help with that.
If you like to cook, maybe try building a recipe manager for yourself. I set up a repo to store all of my recipes in cooklang (similar to markdown), and set up claude skills to find/create/evaluate new recipes.
Building the toy apps might help you come up with ideas for larger things too.
selling it is the hard part, nothing new there
So the lifecycle of an app would be:
1) Create your game/quiz/whatever app.
2) Pay a successful app $x per install, and get a bunch of app installs.
3) Put all sorts of scammy "get extra in game perks if you refer your friends" to try to become viral.
4) Hope to become big enough that people start finding you without having to pay for ads.
5) Sell ads to other facebook app startups to generate installs for them.
It was a completely circular economy. There was no product or income source other than the next layer of the pyramid.
It didn't last long.
Fast forward 18 years, and the company is going strong with millions of subscribers and distributing Oscar winning films such as Demi Moore’s The Substance.
This is not even AI - it's pre-AI, and everyone has continued to try to create things that other people can use as a dependency, just at a much higher pace.
I've found writing simulations that my childhood brain would have LOVED to see run fun and fulfilling.
Also what does society need? Smart workers and people who believe in the system... so where does that leave us? We need to make something that would better enable children to want to grow up in the world and participate. Otherwise we're doing nothing of value and are in a death spiral.
AI is a product in search of a killer feature
First, AGI was going to come any day. GPT-5 had apparently shown intelligence.
Then adult chat with paying customers got started.
To summarize: Everyone wants to automate stuff. Most people do not want to touch boring, large problems.
I am now so deep into the rabbit hole that I have made a version that runs entirely in the browser and an ESP32 version. I have now also taken the printer apart to find that the built in BLE is an external module and I could interface directly with the printer by replacing it with my own custom PCB...
[1] https://sschueller.github.io/posts/making-a-label-printer-work-under-linux-using-agentic-ai/
I think we see an aspect of this here: a lot of things we took for granted are changing, shared assumptions are being challenged, and it's a period where we're all relearning things. To some extent, spending too much time diving into the current iteration of AI tooling might be for nothing if it gets invalidated by another sudden jump.
With all these new tools people are building, I can't help but feel they are building foundations on moving soil.
After the war the US created extra demand in the form of consumerism.
China is creating extra demand for infrastructure overcapacity with its belt and road initiative.
I wouldn't underestimate the ability of the country to creatively create demand to counter oversupply.
I really like this idea. I do a lot of kubernetes ops with workloads I'm unfamiliar with (and not directly responsible for) and often give claude read access in order to help me debug things, including with things like a grafana skill in order to access the same monitoring tools humans have. It's saved me dozens of hours in the last months - and my job is significantly less frustrating now.
Your method of creating ansible playbooks makes _tons_ of sense for this kind of work. I typically create documentation (with claude) for things after I've worked through them (with claude) but playbooks is a very, very clever move.
I would say something similar but as an auditable, controllable kubernetes operator would be pretty welcome.
So you really need customers to react. And this isn't theoretical - people have already lost their jobs and there's really, really good people in the market available right now.
Scary? A little but it's doing great. Not entirely sure why a specialized tool is needed when the general purpose CLI is working.
One does not need a new/separate tool to do any of this, just include it in your agents instructions.
Nowhere in your response did you mention security.
go install github.com/aspectrr/fluid.sh/fluid/cmd/fluid@latest
This lets AI work on cloned production sandboxes vs running on production instances. Yes you can sandbox Claude Code on a production box, but it cannot test changes like it would for production-breaking changes. Sandboxes give AI this flexibility allowing it to safely test changes and reproduce things via IaC like Ansible playbooks.
I'm already using LLMs to generate things and I'm not sure what this adds. The demo isn't really doing it for me, but maybe I'm the wrong target for it. (What is running on that server? You don't know. Build your cattle properly!)
Maybe this is better for one man band devs trying to get something running without caring beyond, it's running.
and on the website: https://fluid.sh
But fluid lets AI investigate, explore, run commands, and edit files in a production-cloned sandbox. LLMs are great at writing IaC, but the LLMs won't get the right context from just generating an Ansible Playbook. They need a place to run commands safely and test changes before writing the IaC. Much like a human, hence the sandbox.
A better approach is to have AI understand how prod is built and make the changes there instead of having AI inspect it and figure out how to apply one off changes.
Models are already very good at writing IaC.
I don’t remember where I got this link from
First I’m personally never going to create infrastructure in the console. I’m going to use IAC from the get go. That means I can reproduce my infra on another account easily.
Second if I did come across an environment where this was already the case, there are tools for both Terraform and CloudFormation where you can reverse your infra to reproducible IAC.
After that, let Claude go wild in my sandbox account with a reasonably scoped IAM role with temporary credentials
Sorry, that last part is absolutely not the case from my experience. IaC also uses the API to inquire about the infrastructure, and there are existing import/export tools around it, so I’m not exactly sure what you are gaining by insisting on abandoning it. IaC also has the benefit of being reusable and commitable.
It's largely because every devops situation is a snowflake and humans love to generalize. Turns out we don't all have the same problems. I haven't seen a startup that's been successful in devops at a level above the HCL / yaml
Don't do the same as everyone!
For safety...
here... Just curl this script and execute it :)
How does the Ansible export work? Do the agents hack around inside the VM and then write a playbook from memory, or are all changes made via Ansible?
If Ansible playbooks are the artifact, what features does Fluid offer over just having agents iterate on an Ansible codebase and having Ansible drive provisioning?
Next month - 'Sorry I caused a $200,000 bill...'
"install kimi 2.5 on a 4x mi300x vm and connect the endpoint to opencode, shut it down in 4 hours"
We're getting close.
it's clear infra level decisions are well beyond what LLMs / agents are capable of today, this area is too high risk, devops is slow to adopt new tooling because of its role and nature
this is still devops. we use cloud-init to setup the vm.
i run the underlying hardware infrastructure and we've automated the provisioning such that we have an api that can start/stop compute at will. even bare metal.
the point of this is that the current $/token model is awful, especially if you're using a lot of tokens. it should be $/minute. pay for what you use.
1. No, I commented, it is not even possible to downvote a reply to your own comment, seems other people must disagree with what you said or how you said it
2. It's against HN guidelines to talk about your downvotes, especially making claims about who has done it
The most likely reason for your downvotes is promoting your own (incomplete) project under someone else's. What did you hope to bring to the conversation?
what could go wrong..
> Safety. I didn't want CC to SSH into a prod machine
The call to action:
> curl -fsSL https://fluid.sh/install.sh | bash
The reason this is ironic: https://x.com/sheeki03/status/2018382483465867444
What does that mean?
Fluid is a terminal agent that does work on production infrastructure like VMs/K8s clusters/etc. by making sandbox clones of the infrastructure for AI agents to work on, allowing the agents to run commands, test connections, edit files, and then generate Infra-as-code like an Ansible Playbook to be applied on production.
Why not just use an LLM to generate IaC?
LLMs are great at generating Terraform, OpenTofu, Ansible, etc. but bad at guessing how production systems work. By giving access to a clone of the infrastructure, agents can explore, run commands, test things before writing the IaC, giving them better context and a place to test ideas and changes before deploying.
I got the idea after seeing how much Claude Code has helped me work on code, I thought "I wish there was something like that for infrastructure", and here we are.
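As a rough picture of the explore-then-export flow described above, here is a minimal sketch that turns a log of commands run in a sandbox into an Ansible playbook. This is not Fluid's implementation (the post does not describe its internals); the `Step` record and `to_playbook` function are hypothetical names. The only real Ansible piece used is the `ansible.builtin.command` module with its `cmd` parameter. Strings are quoted with `json.dumps`, which produces valid YAML double-quoted scalars.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class Step:
    """One command the agent ran in the sandbox (hypothetical record)."""
    description: str
    command: str
    exit_code: int


def to_playbook(hosts: str, steps: List[Step]) -> str:
    """Render the successful sandbox steps as a single-play Ansible playbook."""
    lines = [
        "- name: Changes validated in a sandbox",
        f"  hosts: {json.dumps(hosts)}",
        "  become: true",
        "  tasks:",
    ]
    for step in steps:
        if step.exit_code != 0:
            continue  # failed experiments stay in the sandbox
        lines += [
            f"    - name: {json.dumps(step.description)}",
            "      ansible.builtin.command:",
            f"        cmd: {json.dumps(step.command)}",
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    log = [
        Step("Check nginx config", "nginx -t", 0),
        Step("Try a config path that does not exist", "nginx -t -c /missing", 1),
        Step("Restart nginx", "systemctl restart nginx", 0),
    ]
    print(to_playbook("web", log))
```

Raw `command` tasks are not idempotent, so a real export would want to map steps onto dedicated modules (services, packages, templates) where possible; the sketch only shows that the sandbox session, not a guess, is the input to the playbook.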
Why not just provide tools, skills, MCP server to Claude Code?
Mainly safety. I didn't want CC to SSH into a prod machine from where it is running locally (real problem!). I wanted to lock down the tools it can run to be only on sandboxes while also giving it autonomy to create sandboxes and not have access to anything else.
Fluid gives access to a live output of commands run (it's pretty cool) and does this using ephemeral SSH certificates. Fluid gives tools for creating IaC and requires human approval for creating sandboxes on hosts with low memory/CPU and for accessing the internet or installing packages.
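The approval rules described above can be sketched as a small policy function. The action names, the `Host` record, and the thresholds below are assumptions made up for illustration (the post does not give Fluid's actual limits or API); the sketch only shows the shape of the rule: sandbox work runs freely, a few things always ask, and anything unknown asks too.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical thresholds; the post does not say what counts as "low".
MIN_FREE_MEMORY_MB = 2048
MIN_FREE_CPUS = 2

# Hypothetical action names.
ALWAYS_ASK = {"internet_access", "install_package"}
FREE_IN_SANDBOX = {"run_command", "edit_file"}


@dataclass
class Host:
    name: str
    free_memory_mb: int
    free_cpus: int


def needs_approval(action: str, host: Optional[Host] = None) -> bool:
    """Return True when a human has to approve the action first."""
    if action in FREE_IN_SANDBOX:
        return False
    if action in ALWAYS_ASK:
        return True
    if action == "create_sandbox":
        if host is None:
            return True  # no capacity information: fail closed
        return host.free_memory_mb < MIN_FREE_MEMORY_MB or host.free_cpus < MIN_FREE_CPUS
    return True  # unknown actions always ask


if __name__ == "__main__":
    small = Host("edge-1", free_memory_mb=512, free_cpus=1)
    print(needs_approval("create_sandbox", small))  # low on resources
    print(needs_approval("run_command"))
```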
I greatly appreciate any feedback or thoughts you have, and I hope you get the chance to try out Fluid!
What’s the differentiator?
This is already the modern way to run infra. If you're running simple apps, why are you even spinning up VMs? Container-running platforms make this so easy.
And you thought the costs for burning tokens was high... let's amp it up by spinning up a bunch of cloud infra and let the agents fumble about.
DevOps is my gig, I use agents extensively, I would never do this. This is so wasteful
Lately I have been setting up Pulumi stacks in ephemeral AWS accounts managed by AWS Organizations and working on a Kubernetes cluster locally with Tilt. So far, Claude is pretty good with those things. It seems to have pretty good knowledge of Pulumi, basic knowledge of Tilt, and good knowledge of Kubernetes. It’s a little out of date on some things and needs reminding to RTFM, but it can get a lot done by itself. If it were a real point of friction, a cheat sheet (sorry, “skill”) would be enough to solve the majority of issues.
The example you provide seems to be more along the lines of SSHing into remote boxes and setting things up manually. That’s not really helpful when you want to work on repeatable infra. You try to distinguish yourself from generating Terraform etc., but actually that’s what’s valuable in my experience.