As someone who used to write detailed software specifications for a living, I can see why AI is seductive: the idea that someone, or something, might take my work and continue reading past the abstract.
@mvsamuel
Programming languages person focused on software systems problems. Previously, first frontend engineer on Google Calendar, and was a security engineer who worked on the industrial-strength Mad Libs undergirding Gmail. Pro-trans-rights is pro-family.
Look upon my works, ye Competent, and despair!
Who's got two thumbs and is failing Upwards?
What stands out to me is the cast to (char*) so that it can get byte indexing in the arithmetic part.
Idk if you're always allowed to cast to (char*) the way you can for (void*).
The LLVM opaque pointer stuff is trying to avoid confusion from these kinds of casts, but iiuc that's a pragmatic move.
// That translates to Java like this which is just accumulating
// onto a bog standard StringBuilder.
StringBuilder collector_445 = SafeHtmlBuilder.newCollector();
collector_445.append("<ul>\n");
for (UrlAndText item_446 : items_442) {
  collector_445.append(" <li><a href=\"");
  collector_445.append(
    secure_composition.html.HtmlAttributeEscaper.instance.applySafeUrl(
      secure_composition.html.FilterUrlPrefixEscaper.instance.applyString(item_446.url)
    )
  );
  collector_445.append("\">");
  collector_445.append(
    secure_composition.html.HtmlPcdataEscaper.instance.applyString(item_446.label)
  );
  collector_445.append("</a></li>\n");
}
collector_445.append("</ul>\n");
return SafeHtmlBuilder.fromCollector(collector_445);
And that all translates to Java like the below.
Oh, I should mention that the subtlety in the HTML generation above is that untrusted `javascript:` URLs need to be kept away from attributes like href; simple HTML auto-escaping isn't sufficient.
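A minimal Python sketch of why plain escaping isn't enough for href. Everything named here is an illustrative assumption rather than the SafeHtmlBuilder API: `filter_url`, `render_link`, the scheme allowlist, and the `about:invalid#filtered` placeholder are all made up for the example, and Python's `html.escape` stands in for the HTML escapers.

```python
import html
import re

# Hypothetical allowlist: http(s), mailto, or a URL with no scheme at all
# (no ':' before the first '/', '?', '#', or the end of the string).
_SAFE_URL = re.compile(r"(?:https?:|mailto:|[^:/?#]*(?:[/?#]|\Z))", re.IGNORECASE)


def filter_url(url: str) -> str:
    """Pass through URLs with an allowed prefix; replace anything else."""
    if _SAFE_URL.match(url):
        return url
    return "about:invalid#filtered"  # inert placeholder, an assumption


def render_link(url: str, label: str) -> str:
    """Filter the URL first, then escape for the attribute and text contexts."""
    href = html.escape(filter_url(url), quote=True)
    text = html.escape(label, quote=False)
    return f'<li><a href="{href}">{text}</a></li>'


# Escaping alone leaves a javascript: URL fully intact, since it contains
# no HTML special characters.
unchanged = html.escape("javascript:alert(1)", quote=True)
```

The filter has to run before the attribute escaper: escaping only protects the HTML layer, while the scheme check protects the URL layer nested inside it.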
// html tag connects to a SafeHtmlBuilder library
// PHP-style embedded statements allow for conditions/repetition.
html""" " "<ul> " : for (let item of items) { " <li><a href=${item.url}>${item.label}</a></li> " : } "</ul> "
// Margin characters control how lexing happens:
// " - character data follows
// : - embedded statement fragment tokens follow
// ~ - like " but no LF at end of line
// That is equivalent to the below.
// The runtime semantics of SafeHtmlBuilder are that it analyzes
// "fixed" strings to pick appropriate escapers for untrusted,
// interpolated expressions by propagating contexts across the
// fixed parts.
do {
  let accumulator_0 = new SafeHtmlBuilder();
  accumulator_0.appendFixed("<ul>\n");
  for (let item of items) {
    accumulator_0.appendFixed(" <li><a href=");
    accumulator_0.append(item.url);
    accumulator_0.appendFixed(">");
    accumulator_0.append(item.label);
    accumulator_0.appendFixed("</a></li>\n");
  }
  accumulator_0.appendFixed("</ul>\n");
  accumulator_0.accumulated
}
// Escaper picking is based on transition tables which fall into a
// side-effect free subset of the language.
// We can compile-time introspect over SafeHtmlBuilder to predict
// its context properties at points in the CFG graph.
// When SafeHtmlBuilder is in a URL attribute, its HTML state machine
// is also driving a URL state machine mediated by an HTML codec.
do {
  let accumulator_0 = new SafeHtmlBuilder();
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.appendFixed("<ul>\n");
  for (let item of items) {
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
    accumulator_0.appendFixed(" <li><a href=");
    // assert(accumulator_0.context == HtmlContext(AttrValue, Url, Dq, 0) &&
    //        accumulator_0.delegate?.context == UrlContext(0))
    accumulator_0.append(item.url);
    accumulator_0.appendFixed(">");
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
    accumulator_0.append(item.label);
    accumulator_0.appendFixed("</a></li>\n");
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  }
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.appendFixed("</ul>\n");
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.accumulated
}
// Finally we inline escaper decisions, and erase the builder object,
// so analysis doesn't have to happen at runtime.
do {
  let collector_0 = SafeHtmlBuilder.newCollector();
  collector_0.appendFixed("<ul>\n");
  for (let item of items) {
    collector_0.appendFixed(" <li><a href=\""); // Quotes added by SM
    collector_0.appendString(
      HtmlAttributeEscaper.instance.applySafeUrl(
        FilterUrlPrefixEscaper.instance.applyString(item.url)
      )
    );
    collector_0.appendFixed("\">");
    collector_0.appendString(
      HtmlPcdataEscaper.instance.applyString(item.label)
    );
    collector_0.appendFixed("</a></li>\n");
  }
  collector_0.appendFixed("</ul>\n");
  SafeHtmlBuilder.fromCollector(collector_0)
}
To tease some work on letting security engineers specify how untrusted and trusted values combine, I hacked in some meta-programming.
The goal is to allow enough MP so that repetitive runtime analysis can be erased and you're just appending strings.
I think meta-programming is super important for cross-translating languages.
RTTI and reflection are used at program boundaries, e.g. converting values into JSON so they can pass by copy to other programs written in other languages.
But they're also semantic tarpits, so some MP simplifies interop.
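A rough Python sketch of the boundary case, assuming dataclasses as the value types. `to_jsonable` and `to_json` are hypothetical helper names, and `UrlAndText` just borrows the `url` and `label` fields from the Java translation. The encoder finds fields by reflecting over the value at runtime with `dataclasses.fields`, which is the kind of repeated runtime analysis that meta-programming could in principle do once, ahead of time.

```python
import dataclasses
import json


@dataclasses.dataclass
class UrlAndText:
    url: str
    label: str


def to_jsonable(value):
    """Reflectively convert dataclass instances (and lists of them) to plain data."""
    if dataclasses.is_dataclass(value) and not isinstance(value, type):
        # Runtime reflection: field names are discovered on every call.
        return {
            f.name: to_jsonable(getattr(value, f.name))
            for f in dataclasses.fields(value)
        }
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


def to_json(value) -> str:
    """Serialize by copy so another program can parse the result."""
    return json.dumps(to_jsonable(value), sort_keys=True)
```

For example, `to_json(UrlAndText("https://example.com/", "Example"))` produces a JSON object with `label` and `url` keys.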
blog.computationalcomplexity.org/2026/03/tony...
it was going to happen, death comes for us all. but man. what a legend
Co-expressions are kind of wild if you're not familiar with them.
www.cs.tufts.edu/~nr/cs257/ar...
I guess the reason it seems similar to me is scheduling.
unquote-* allow evaluation of sub-expressions in a context where evaluation is typically delayed
And yield* here is explicitly scheduling the coro to actually produce the subtrees it emplaces iiuc.
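A loose analogy in Python, where `yield from` plays roughly the role that `yield*` plays in JavaScript. This is a sketch of the comparison only, not the code under discussion, and the generator names are made up. Delegating with `yield from` runs the sub-generator and splices its items inline, much like unquote-splicing; yielding the collected result inserts it as one nested value, much like plain unquote.

```python
def children():
    """Sub-generator whose items represent subtrees."""
    yield "<li>a</li>"
    yield "<li>b</li>"


def spliced():
    yield "<ul>"
    # Delegation: the sub-generator is driven here and each of its
    # items lands directly in the outer sequence.
    yield from children()
    yield "</ul>"


def nested():
    yield "<ul>"
    # No delegation: the evaluated result is inserted as a single element.
    yield list(children())
    yield "</ul>"
```

`list(spliced())` is a flat four-item list, while `list(nested())` has three items with a list in the middle.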
Very cool that the types are lining up.
Or just unquote
Is yield* used here as unquote-splicing?
Veni, vidi, vibi
US poster-children for hopium react as if incorrigible know-nothing liar is trustworthy.
People post commits that net remove code and that's great, but I wish there were a convention that allowed separately counting test and prod LoC.
A PR that removes some unnecessary prod code but adds a lot of tests (improving coverage) is, to my mind, worth celebrating on two counts.
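One way such a convention might look, sketched in Python. It assumes input in the tab-separated added/deleted/path shape that `git diff --numstat` prints, with "-" for binary files. The `split_loc` name and the "path contains test" heuristic are assumptions for illustration; a real project would want its own path rules.

```python
def split_loc(numstat: str, is_test=lambda path: "test" in path.lower()):
    """Return net line changes, counted separately for prod and test files."""
    totals = {"prod": 0, "test": 0}
    for line in numstat.splitlines():
        parts = line.split("\t", 2)
        # Skip malformed lines and binary files, which report "-" counts.
        if len(parts) != 3 or parts[0] == "-":
            continue
        added, deleted, path = parts
        bucket = "test" if is_test(path) else "prod"
        totals[bucket] += int(added) - int(deleted)
    return totals


example = "2\t40\tsrc/parser.py\n120\t5\ttests/test_parser.py\n-\t-\tlogo.png"
```

On `example`, prod nets out negative and test nets out positive, which is exactly the PR that deserves two cheers.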
And they've been doing that since 1966.
en.wikipedia.org/wiki/ELIZA_e...
I want everyone involved to lose their shirts.
But this seems like people trading on contracts/derivatives assuming they can value those without understanding, in detail, the contract/derivation.
The print, fine or not, is the valuation basis.
On multiple levels, you're just doing it wrong.
Actually, I guess he's probably imagining a gaggle of agents each with their own agendae (metaphorically) and so they can both be conspiring and confusable.
Which is probably fair.
When Mark talks like that he's usually leading into some fine-grained distinction: side- vs covert- channels, etc.
Is that distinction important in the context of agentic security?
If confused deputy problems are the major problems, distinctions relevant to conspiring sub-processes seem OoB.
I've heard that but don't understand how, for textual content, leaving a gap in column zero helps.
Yeah. I think the neighbour part of those stories rests heavily on social norms. Neighbours are confusable deputies but confusable within the bounds of social norms.
Text with two indented, italicized paragraphs both starting with "In an emergency" in between three other paragraphs: Marc Stiegler has identified the root cause of the problem; our devices do not support aspects of sharing that we rely on in the physical world. These aspects can be illustrated with two stories. In an emergency, Marc asked me to park his car in my garage. I couldn't do it, so I asked my neighbor to do it for me and told her to get the garage key from my son. I doubt that anyone would think twice about this story. The second story is in the computer domain. In an emergency, Marc asked me to copy a file from his computer to mine. I couldn't do it, so I asked my neighbor to do it for me and told her to get access to my computer from my son. People often find this second story so preposterous that they laugh.
The Six Aspects of Sharing
Figure 1 illustrates the six aspects of sharing that we rely on in the physical world.
That text is around a diagram with six images, text labels and arrows. Light blue arrows connect each of four in a right to left chain, and black arrows from the bottom left "Cross domain" extend to those. Another light blue arrow goes from an accountant, labeled "Accountable", to the right of the top four. The top four are, from right to left (in order of the arrows):
- Recomposable. A hand giving keys to another waiting hand with a shadowy third hand perhaps giving the keys earlier.
- Chained. The same image of two hands but without the third.
- Attenuated. The same image of two hands.
- Dynamic. A man in a suit and holding a briefcase running.
The Cross domain image in the bottom left is an open gate in a possibly electrified fence like you might find around a farm field. Dynamic above a man in a suit running.
Relevant to agentic access control.
I was reminded of Alan Karp's & Marc Stiegler's litmus test for usable access control, and Marc's six aspects of sharing.
alanhkarp.com/publications...
A portion of Wikipedia's ASCII table showing hex row markers on the left from 4x to 7x. The rest of each row is 16 boxes. Capital A starts at 0x41, the second cell of the 4x line. Capital letters continue in sequence until capital O at the far right of that row. Then P is at the wraparound position: column 0 of row 5x and is followed in order until 'Z' at the 5th from last column of row 5x. Rows 6x and 7x repeat the same pattern with lower case letters.
A portion of Wikipedia's EBCDIC table with hex row headers 8x, 9x, and Ax and 16 columns. This portion includes the lower case Latin letters a-z. The second through tenth columns of row 8x has a-i in order. Then the last six columns are not occupied by letters; they're blank except the +/- symbol in the rightmost cell. Row 9x has j-r in the second through tenth columns and the other cells are blank. Row Ax finishes up the letters with s-z but 's' does not start in the second column. Tilde (~) is there and s is in the third column.
ASCII designers: Every letter has a numeric value one greater than the letter before it.
EBCDIC designers: Rectangles are neat.
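A quick way to check the difference, sketched in Python. It uses the standard library's `cp037` codec as a stand-in for EBCDIC; that is one EBCDIC code page among several, so treat the choice as an assumption. The helper name `non_contiguous_pairs` is made up.

```python
import string


def non_contiguous_pairs(encoding: str):
    """List adjacent lowercase letters whose byte values are not consecutive."""
    letters = string.ascii_lowercase
    codes = [ch.encode(encoding)[0] for ch in letters]
    return [
        (letters[i], letters[i + 1])
        for i in range(len(letters) - 1)
        if codes[i + 1] - codes[i] != 1
    ]


ascii_gaps = non_contiguous_pairs("ascii")
ebcdic_gaps = non_contiguous_pairs("cp037")
```

The practical upshot is that a range check like "between a and z" is a letter test in ASCII, but in EBCDIC the same range also covers the non-letter cells in the gaps.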
Even if you don't find the Totenkopf tattoo disqualifying, Platner's campaign team abandoned him.
This doesn't happen to a lot of pols.
The people closest to him and most aware of his politics do not believe he is as he presented himself.
Periodic reminder: Meta's engineering culture 11 years ago.
The company was founded in 2004, so about midway between its founding and now.
www.bitdefender.com/en-us/blog/h...
17. Test and Validation Matrix A conforming implementation should include tests that cover the behaviors defined in this specification.
The thing that maxed my eyebrow raisage is section 17's list of topics that a conformance test suite should cover, but without any specification of desired system behaviour.
This is spec jargon word salad.
Is there a bit of "we don't need to define configuration parsing; just prepend your config choices to the prompt" in this?
If the spec includes "if X is desired" it is less a description of a configurable program and more a lens through which instances with hard-coded config choices can be projected.
Don't orchestrators tend to have the property that they call out, but things don't call into them?
That seems kind of the sweet spot for this sort of thing. You don't need stable, backwards compatible APIs when nothing calls into you, when you're not a peer in a larger system.
(ii) Addition of Try-Catch Statements: A recurring pattern observed in the code generated by the LLMs, when prompted with techniques designed to include security considerations, is the addition of try-catch statements. Specifically, in code generated through the use of the naive-secure variant of zero-shot prompts, these try-catch blocks were added as a standalone security measure, without any other security enhancements. These instances typically occurred when the models could not identify vulnerabilities or weaknesses in the code apart from potential run-time errors. Consequently, they resorted to including rudimentary security provisions through these blocks. While these try-catch statements
(vii) Modification of Method Names: Quite commonly, when employing zero-shot prompt variations, we observe a pattern where method names in the generated code are prefixed with the term “secure.” For instance, we came across method names like secure_ping(), secure_memory_allocation, secure_upload_file, and the like. However, it is noteworthy that in many cases, the actual implementation within these methods remains unaltered, despite the suggestive “secure” prefixes in the method names. This tendency is
It's actually a pretty funny paper too in a dry way.
There's a section on things models do in response to prompting that doesn't provide specific guidance.
Agents not beating the enthusiastic junior dev rap.
Highlighted section of several article paragraphs: the adoption of different prompting techniques does not significantly diminish the frequency of this weakness in the generated code by any of the three models Text follows: 7.2 Prominent Security Weaknesses in LLM-Generated Code In this section, we delve into our findings through the lens of the key CWEs detected by Bandit which are highlighted in Table 5, discussing the challenges they pose to the task of generating secure code. CWE-78. CWE-78 stands out as one of the most frequently recorded weaknesses across the code generated by all three LLMs. It manifests when an application incorporates external input to construct an OS command but fails to adequately neutralize special characters or elements within the command. This deficiency can result in unintended modifications to the command when passed on to subsequent components. In the LLM-generated code, this weakness often materializes in the form of an OS command initiating a process with a partial executable path or when a subprocess.run() command is invoked using user-provided input. Examining Table 5, it is evident that the adoption of different prompting techniques does not significantly diminish the frequency of this weakness in the generated code by any of the three models. This underscores the necessity for meticulous crafting of prompts, particularly for coding tasks involving subprocess calls or other OS commands reliant on external input.
Our analysis reaffirms the prevalence of security weaknesses in code generated by LLMs when prompted with NL instructions, with significant challenges stemming from CWE-78, CWE-259, CWE-94, and CWE-330. Among the prompting techniques investigated, RCI, a refinement-based approach, exhibited notable effectiveness in preventing security weaknesses in LLM-generated code. Particularly noteworthy was its performance with GPT-4, where it reduced the average
There are some stark conclusions that can help PL&seceng people like me prioritize.
In correspondence, the first author predicts that PL&tool changes that reduce the token count of standing orders and error messages (JSON logging modes) will improve the effectiveness of multi-step generation.