Verification and comprehension are not mutually exclusive. You might be right that instead of starting with comprehension it may make sense to *start with* verification. And then whether to proceed to comprehension is a judgment call based on the cost of failure, our understanding of the tool's strengths and weaknesses as well as past experience.
Most of the examples provided don't involve non-determinism. I think it's the non-determinism that often disturbs our intuition, especially in those cases where it is incidental, not essential. I like some creativity when writing an essay, but not when counting. And, with a compiler, if we were concerned enough, we could understand how it works and why it works a certain way.
I do think that, if we are to leverage the power of AI-generated code without blind trust, we may need to think probabilistically in cases that make us uncomfortable.
Thanks for your insightful comments, Joe. The point I was trying to make is that I basically agree with you. But a non-deterministic system can and does produce deterministic output. In fact, tool use is effectively the bridge between non-determinism and determinism, and we can test that output very effectively. No probability needed -- it runs until we get the results we're looking for.
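The "run until we get the results we're looking for" pattern can be sketched roughly like this. This is a minimal illustration, not anyone's actual system: `generate_candidate` is a hypothetical stand-in for the non-deterministic step (an LLM call), and the deterministic test is deliberately trivial.

```python
import random

def generate_candidate(seed):
    """Hypothetical stand-in for a non-deterministic generator (e.g. an LLM call)."""
    rng = random.Random(seed)
    return [rng.randint(0, 9) for _ in range(3)]

def passes_tests(output):
    """Deterministic check on the output; here, accept only sorted candidates."""
    return output == sorted(output)

def run_until_verified(max_attempts=100):
    """Retry the non-deterministic step until its output passes the tests."""
    for attempt in range(max_attempts):
        candidate = generate_candidate(attempt)
        if passes_tests(candidate):
            return candidate
    raise RuntimeError("no verified output within budget")
```

The key point is that the caller never sees an unverified result: whatever non-determinism happens inside the loop, the function either returns output that passed the deterministic check or fails loudly.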
Ah. That makes sense. I was thinking about both those cases that are "virtually deterministic" (the ones you're talking about) as well as those that aren't but where we still have some sense of what we expect and where probability can be useful.
I'm actually working on a problem like that right now -- using an LLM to build adapters. It gets to a certain level, then fails, and in fact cannot proceed further. Is that a failure? No, because I can test those failures and build tool calls to handle them in code. This is why observability is so important. It's impossible to work with LLMs without it.
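One way to read "test those failures and build tool calls to handle them in code" is a pattern like the following sketch. Everything here is hypothetical -- `llm_adapter` stands in for the LLM-built adapter, and `legacy_adapter` for the hand-written code path that covers the cases it could not handle; the point is that each failure is observed (logged) and then routed deterministically.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("adapter")

def llm_adapter(record):
    """Hypothetical LLM-built adapter; fails on a format it never learned to handle."""
    if "legacy_id" in record:
        raise ValueError("unsupported legacy format")
    return {"id": record["id"], "name": record["name"]}

def legacy_adapter(record):
    """Hand-written fallback for the observed failure cases."""
    return {"id": record["legacy_id"], "name": record.get("name", "unknown")}

def adapt(record):
    """Observe each failure, then route it to deterministic fallback code."""
    try:
        return llm_adapter(record)
    except ValueError as exc:
        log.info("llm adapter failed (%s); using fallback", exc)
        return legacy_adapter(record)
```

Without the logging (observability), you would never know which inputs the LLM-built path fails on, and so could never write the fallback.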
A reader has kindly pointed out there are a few incorrect links in the article. I apologize for that, will correct them today.
Links should be fixed now. Thanks for your patience.