Mike Samuel 🟣

@mvsamuel

Programming languages person focused on software systems problems. Previously, first frontend engineer on Google Calendar, and was a security engineer who worked on the industrial-strength Mad Libs undergirding Gmail. Pro-trans-rights is pro-family.

771
Followers
820
Following
1,881
Posts
21.08.2023
Joined

Latest posts by Mike Samuel 🟣 @mvsamuel

As someone who used to write detailed software specifications for a living, I can see why AI is seductive: the idea that someone, or something, might take my work and continue reading past the abstract.

10.03.2026 20:19 👍 9 🔁 1 💬 1 📌 0

Look upon my works, ye Competent, and despair!

Who's got two thumbs and is failing Upwards?

10.03.2026 19:06 👍 0 🔁 0 💬 0 📌 0

What stands out to me is the cast to (char*) so that it can get byte indexing in the arithmetic part.

Idk if you're always allowed to cast to (char*) the way you can for (void*).

The LLVM opaque pointer stuff is trying to avoid confusion from these kinds of casts, but iiuc that's a pragmatic move.

10.03.2026 18:55 👍 0 🔁 0 💬 0 📌 0
// That translates to Java like this which is just accumulating
// onto a bog standard StringBuilder.
StringBuilder collector_445 = SafeHtmlBuilder.newCollector();
collector_445.append("<ul>\n");
for (UrlAndText item_446 : items_442) {
    collector_445.append("  <li><a href=\"");
    collector_445.append(secure_composition.html.HtmlAttributeEscaper.instance.applySafeUrl(secure_composition.html.FilterUrlPrefixEscaper.instance.applyString(item_446.url)));
    collector_445.append("\">");
    collector_445.append(secure_composition.html.HtmlPcdataEscaper.instance.applyString(item_446.label));
    collector_445.append("</a></li>\n");
}
collector_445.append("</ul>\n");
return SafeHtmlBuilder.fromCollector(collector_445);

And that all translates to Java like the below.

Oh, I should mention that the subtlety in the HTML generation above is that untrusted `javascript:` URLs need to be kept away from attributes like href; simple HTML auto-escaping isn't sufficient.

10.03.2026 18:13 👍 0 🔁 0 💬 0 📌 0
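
The `javascript:` point in the post above can be sketched in Java. HTML escaping alone leaves `javascript:alert(1)` untouched because the string contains no HTML special characters, so a separate URL filter has to run first. This is a minimal sketch under stated assumptions: `UrlPrefixFilter`, its allow-list of schemes, and the `about:invalid#blocked` placeholder are hypothetical and are not the `FilterUrlPrefixEscaper` named in the generated code.

```java
import java.util.Locale;

final class UrlPrefixFilter {
    // Hypothetical placeholder substituted for rejected URLs.
    static final String BLOCKED = "about:invalid#blocked";

    /** Minimal HTML escaping for a quoted attribute value. */
    static String escapeHtmlAttr(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&#39;");
    }

    /** Passes scheme-less URLs and an allow-list of schemes; replaces the rest. */
    static String filterUrl(String url) {
        String trimmed = url.trim();
        int colon = trimmed.indexOf(':');
        int delim = firstPathDelimiter(trimmed);
        // A colon only marks a scheme if it comes before any '/', '?' or '#'.
        boolean hasScheme = colon >= 0 && (delim < 0 || colon < delim);
        if (!hasScheme) {
            return trimmed;
        }
        String scheme = trimmed.substring(0, colon).toLowerCase(Locale.ROOT);
        boolean allowed = scheme.equals("http")
                || scheme.equals("https")
                || scheme.equals("mailto");
        return allowed ? trimmed : BLOCKED;
    }

    private static int firstPathDelimiter(String s) {
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String untrusted = "javascript:alert(1)";
        // Escaping alone: the scheme survives and would still run from an href.
        System.out.println(escapeHtmlAttr(untrusted));
        // Filter first, then escape: the scheme is replaced by the placeholder.
        System.out.println(escapeHtmlAttr(filterUrl(untrusted)));
    }
}
```

The sketch uses an allow-list rather than a deny-list, so variants such as mixed case or an embedded tab in the scheme fall through to the blocked branch instead of needing their own rule.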
// html tag connects to a SafeHtmlBuilder library
// PHP-style embedded statements allow for conditions/repetition.
html"""                                                                       "
  "<ul>                                                                       "
  : for (let item of items) {
  "  <li><a href=${item.url}>${item.label}</a></li>                           "
  : }
  "</ul>                                                                      "

// Margin characters control how lexing happens:
// " - character data follows
// : - embedded statement fragment tokens follow
// ~ - like " but no LF at end of line

// That is equivalent to the below.
// The runtime semantics of SafeHtmlBuilder are that it analyzes
// "fixed" strings to pick appropriate escapers for untrusted,
// interpolated expressions by propagating contexts across the
// fixed parts.
do {
  let accumulator_0 = new SafeHtmlBuilder();
  accumulator_0.appendFixed("<ul>\n");
  for (let item of items) {
    accumulator_0.appendFixed("  <li><a href=");
    accumulator_0.append(item.url);
    accumulator_0.appendFixed(">");
    accumulator_0.append(item.label);
    accumulator_0.appendFixed("</a></li>\n");
  }
  accumulator_0.appendFixed("</ul>\n");
  accumulator_0.accumulated
}

// Escaper picking is based on transition tables which fall into a
// side-effect free subset of the language.
// We can compile-time introspect over SafeHtmlBuilder to predict
// its context properties at points in the CFG.
// When SafeHtmlBuilder is in a URL attribute, its HTML state machine
// is also driving a URL state machine mediated by an HTML codec.
do {
  let accumulator_0 = new SafeHtmlBuilder();
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.appendFixed("<ul>\n");
  for (let item of items) {
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
    accumulator_0.appendFixed("  <li><a href=");
    // assert(accumulator_0.context == HtmlContext(AttrValue, Url, Dq, 0) &&
    //        accumulator_0.delegate?.context == UrlContext(0))
    accumulator_0.append(item.url);
    accumulator_0.appendFixed(">");
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
    accumulator_0.append(item.label);
    accumulator_0.appendFixed("</a></li>\n");
    // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  }
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.appendFixed("</ul>\n");
  // assert(accumulator_0.context == HtmlContext(0, 0, 0, 0))
  accumulator_0.accumulated
}

// Finally we inline escaper decisions, and erase the builder object,
// so analysis doesn't have to happen at runtime.
do {
  let collector_0 = SafeHtmlBuilder.newCollector();
  collector_0.appendFixed("<ul>\n");
  for (let item of items) {
    collector_0.appendFixed("  <li><a href=\""); // Quotes added by SM
    collector_0.appendString(
      HtmlAttributeEscaper.instance.applySafeUrl(
        FilterUrlPrefixEscaper.instance.applyString(item.url)
      )
    );
    collector_0.appendFixed("\">");
    collector_0.appendString(
      HtmlPcdataEscaper.instance.applyString(item.label)
    );
    collector_0.appendFixed("</a></li>\n");
  }
  collector_0.appendFixed("</ul>\n");
  SafeHtmlBuilder.fromCollector(collector_0)
}

To tease some work on letting security engineers specify how untrusted and trusted values combine, I hacked in some meta-programming.

The goal is to allow enough MP so that repetitive runtime analysis can be erased and you're just appending strings.

10.03.2026 18:12 👍 0 🔁 0 💬 1 📌 0
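
The runtime analysis that the meta-programming is meant to erase can be sketched as a small state machine in Java. This is a rough illustration, not the SafeHtmlBuilder from the post: `TinySafeHtmlBuilder`, its four contexts, and its escapers are all hypothetical, it only understands double-quoted attributes, and it does not add missing quotes the way the real state machine is described as doing.

```java
import java.util.Locale;

final class TinySafeHtmlBuilder {
    enum Context { PCDATA, TAG, URL_ATTR, OTHER_ATTR }

    private final StringBuilder out = new StringBuilder();
    private final StringBuilder tagText = new StringBuilder();
    private Context context = Context.PCDATA;

    Context context() { return context; }

    /** Trusted template text: appended verbatim and scanned to update the context. */
    TinySafeHtmlBuilder appendFixed(String fixed) {
        for (int i = 0; i < fixed.length(); i++) {
            char c = fixed.charAt(i);
            switch (context) {
                case PCDATA:
                    if (c == '<') { context = Context.TAG; tagText.setLength(0); }
                    break;
                case TAG:
                    if (c == '>') {
                        context = Context.PCDATA;
                    } else if (c == '"') {
                        // The attribute name just before the quote picks the context.
                        String t = tagText.toString().toLowerCase(Locale.ROOT);
                        boolean isUrl = t.endsWith(" href=") || t.endsWith(" src=");
                        context = isUrl ? Context.URL_ATTR : Context.OTHER_ATTR;
                    } else {
                        tagText.append(c);
                    }
                    break;
                default: // inside a double-quoted attribute value
                    if (c == '"') { context = Context.TAG; }
            }
        }
        out.append(fixed);
        return this;
    }

    /** Untrusted value: the escaper is chosen from the current context. */
    TinySafeHtmlBuilder append(String untrusted) {
        switch (context) {
            case PCDATA:
            case OTHER_ATTR:
                out.append(escape(untrusted));
                break;
            case URL_ATTR:
                out.append(escape(filterUrl(untrusted)));
                break;
            default:
                throw new IllegalStateException("no escaper for context " + context);
        }
        return this;
    }

    String accumulated() { return out.toString(); }

    // Hypothetical filter: only http(s) and root-relative URLs pass.
    private static String filterUrl(String url) {
        String trimmed = url.trim();
        String lower = trimmed.toLowerCase(Locale.ROOT);
        boolean ok = lower.startsWith("http://") || lower.startsWith("https://")
                || lower.startsWith("/");
        return ok ? trimmed : "about:invalid#blocked";
    }

    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
                .replace("\"", "&quot;").replace("'", "&#39;");
    }

    public static void main(String[] args) {
        String html = new TinySafeHtmlBuilder()
                .appendFixed("<a href=\"").append("javascript:alert(1)")
                .appendFixed("\">").append("<b>Tom & Jerry</b>")
                .appendFixed("</a>").accumulated();
        System.out.println(html);
    }
}
```

Every `append` here consults state computed by scanning the fixed strings at runtime. Because the fixed strings are known at compile time, that scan is the repetitive work the post proposes to erase, leaving only direct calls to the chosen escapers.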

I think meta-programming is super important for cross-translating languages.

RTTI and reflection are used at program boundaries, e.g. converting values into JSON so they can pass by copy to other programs written in other languages.

But they're also semantic tarpits, so some MP simplifies interop.

10.03.2026 17:55 👍 0 🔁 0 💬 1 📌 0
Tony Hoare (1934-2026) Turing Award winner and former Oxford professor Tony Hoare passed away last Thursday at the age of 92. Hoare is famous for quicksort, ALGO...

blog.computationalcomplexity.org/2026/03/tony...

it was going to happen, death comes for us all. but man. what a legend

10.03.2026 16:39 👍 101 🔁 28 💬 2 📌 3

Co-expressions are kind of wild if you're not familiar with them.

www.cs.tufts.edu/~nr/cs257/ar...

10.03.2026 16:07 👍 2 🔁 0 💬 1 📌 0

I guess the reason it seems similar to me is scheduling.

unquote-* allow evaluation of sub-expressions in a context where evaluation is typically delayed

And yield* here is explicitly scheduling the coro to actually produce the subtrees it emplaces iiuc.

Very cool that the types are lining up.

10.03.2026 15:32 👍 2 🔁 0 💬 1 📌 0

Or just unquote

10.03.2026 15:02 👍 2 🔁 0 💬 1 📌 0

Is yield* used here as unquote-splicing?

10.03.2026 15:01 👍 2 🔁 0 💬 1 📌 0

Veni, vidi, vibi

09.03.2026 21:27 👍 1 🔁 0 💬 0 📌 0

US poster-children for hopium react as if incorrigible know-nothing liar is trustworthy.

09.03.2026 21:20 👍 2 🔁 0 💬 0 📌 0

People post commits that net remove code and that's great, but I wish there were a convention that allowed separately counting test and prod LoC.

A PR that removes some unnecessary prod code but adds a lot of tests (improving coverage), to my mind is worth celebrating on two counts.

09.03.2026 19:23 👍 2 🔁 0 💬 0 📌 0
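
One way such a convention could look is a small tally over `git diff --numstat` output, which prints one `added<TAB>deleted<TAB>path` line per file and `-` for binary files. This is only a sketch: the `LocSplit` class is hypothetical, and treating any path with a `test` or `tests` directory segment as test code is an assumption that would need adjusting per repository.

```java
import java.util.List;

final class LocSplit {
    int prodAdded;
    int prodDeleted;
    int testAdded;
    int testDeleted;

    /** Assumed convention: a directory segment named "test" or "tests" marks test code. */
    static boolean isTestPath(String path) {
        for (String segment : path.split("/")) {
            if (segment.equals("test") || segment.equals("tests")) {
                return true;
            }
        }
        return false;
    }

    /** Tallies lines of `git diff --numstat` output: added, deleted, path, tab-separated. */
    static LocSplit fromNumstat(List<String> lines) {
        LocSplit split = new LocSplit();
        for (String line : lines) {
            String[] fields = line.split("\t", 3);
            if (fields.length < 3 || fields[0].equals("-")) {
                continue; // malformed line, or a binary file reported as "-"
            }
            int added = Integer.parseInt(fields[0]);
            int deleted = Integer.parseInt(fields[1]);
            if (isTestPath(fields[2])) {
                split.testAdded += added;
                split.testDeleted += deleted;
            } else {
                split.prodAdded += added;
                split.prodDeleted += deleted;
            }
        }
        return split;
    }

    int prodNet() { return prodAdded - prodDeleted; }

    int testNet() { return testAdded - testDeleted; }

    public static void main(String[] args) {
        LocSplit split = fromNumstat(List.of(
                "3\t40\tsrc/main/java/Foo.java",
                "120\t0\tsrc/test/java/FooTest.java",
                "-\t-\tdocs/logo.png"));
        System.out.println("prod net: " + split.prodNet() + ", test net: " + split.testNet());
    }
}
```

Reported separately, a PR like the one in the example shows a net removal of prod code and a net addition of tests, rather than a single number in which the two cancel out.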
ELIZA effect - Wikipedia

And they've been doing that since 1966.

en.wikipedia.org/wiki/ELIZA_e...

06.03.2026 22:02 👍 1 🔁 0 💬 0 📌 0

I want everyone involved to lose their shirts.

But this seems like people trading on contracts/derivatives assuming they can value those without understanding, in detail, the contract/derivation.

The print, fine or not, is the valuation basis.

On multiple levels, you're just doing it wrong.

06.03.2026 21:58 👍 0 🔁 0 💬 0 📌 0
Defeat in detail - Wikipedia

Battre en détail.

en.wikipedia.org/wiki/Defeat_...

06.03.2026 21:47 👍 1 🔁 0 💬 0 📌 0

Actually, I guess he's probably imagining a gaggle of agents each with their own agendae (metaphorically) and so they can both be conspiring and confusable.

Which is probably fair.

06.03.2026 21:32 👍 0 🔁 0 💬 1 📌 0

When Mark talks like that he's usually leading into some fine-grained distinction: side- vs covert- channels, etc.

Is that distinction important in the context of agentic security?

If confused deputy problems are the major problems, distinctions relevant to conspiring sub-processes seem OoB.

06.03.2026 21:30 👍 0 🔁 0 💬 1 📌 0

I've heard that but don't understand how, for textual content, leaving a gap in column zero helps.

06.03.2026 21:12 👍 0 🔁 0 💬 0 📌 0

Yeah. I think the neighbour part of those stories rests heavily on social norms. Neighbours are confusable deputies but confusable within the bounds of social norms.

06.03.2026 19:57 👍 4 🔁 1 💬 1 📌 0
Text with two indented, italicized paragraphs both starting with "In an emergency" in between three other paragraphs: 

Marc Stiegler has identified the root cause of the problem; our devices do not support aspects of sharing that we rely on in the physical world. These aspects can be illustrated with two stories.

In an emergency, Marc asked me to park his car in my garage. I couldn't do it, so I asked my neighbor to do it for me and told her to get the garage key from my son.

I doubt that anyone would think twice about this story. The second story is in the computer domain.

In an emergency, Marc asked me to copy a file from his computer to mine. I couldn't do it, so I asked my neighbor to do it for me and told her to get access to my computer from my son.

People often find this second story so preposterous that they laugh.

The Six Aspects of Sharing
Figure 1 illustrates the six aspects of sharing that we rely on in the physical world.

That text is around a diagram with six images, text labels and arrows.

Light blue arrows connect each of four in a right to left chain, and black arrows from the bottom left "Cross domain" extend to those. Another light blue arrow goes from an accountant, labeled "Accountable", to the right of the top four.

The top four are, from right to left (in order of the arrows):

- Recomposable.  A hand giving keys to another waiting hand with a shadowy third hand perhaps giving the keys earlier.
- Chained.  The same image of two hands but without the third.
- Attenuated.  The same image of two hands.
- Dynamic.  A man in a suit and holding a briefcase running.

The Cross domain image in the bottom left is an open gate in a possibly electrified fence like you might find around a farm field.

Dynamic above a man in a suit running.

Relevant to agentic access control.

I was reminded of Alan Karp's & Marc Stiegler's litmus test for usable access control, and Marc's six aspects of sharing.

alanhkarp.com/publications...

06.03.2026 19:45 👍 10 🔁 4 💬 1 📌 0
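
Three of the aspects named in the figure (attenuated, chained, dynamic) can be illustrated with the file-copy story as a capability-style sketch in Java. Everything here is hypothetical: `Computer`, `Grant`, and `grantDeposit` are invented names for illustration and do not come from the Karp and Stiegler paper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

final class Computer {
    private final Map<String, String> files = new HashMap<>();

    /** A delegated right plus the owner's means of withdrawing it. */
    static final class Grant {
        final Consumer<String> deposit; // hand this to the delegate
        final Runnable revoke;          // the owner keeps this

        Grant(Consumer<String> deposit, Runnable revoke) {
            this.deposit = deposit;
            this.revoke = revoke;
        }
    }

    /**
     * Attenuated: the capability can write one named file and nothing else.
     * Dynamic: the owner can revoke it at any time.
     */
    Grant grantDeposit(String fileName) {
        boolean[] live = { true };
        Consumer<String> deposit = data -> {
            if (!live[0]) {
                throw new IllegalStateException("revoked");
            }
            files.put(fileName, data);
        };
        return new Grant(deposit, () -> live[0] = false);
    }

    String read(String fileName) {
        return files.get(fileName);
    }

    public static void main(String[] args) {
        Computer mine = new Computer();
        Grant grant = mine.grantDeposit("marcs-file.txt");
        // Chained: the son receives the capability and passes it to the neighbor.
        Consumer<String> sons = grant.deposit;
        Consumer<String> neighbors = sons;
        neighbors.accept("contents of Marc's file");
        grant.revoke.run(); // the emergency is over
        System.out.println(mine.read("marcs-file.txt"));
    }
}
```

The neighbor never holds general access to the computer, only the ability to deposit one file, which is roughly what handing over a garage key does in the first story.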
A portion of Wikipedia's ASCII table showing hex row markers on the left from 4x to 7x.

The rest of each row is 16 boxes.

Capital A starts at 0x41, the second cell of the 4x line. Capital letters continue in sequence until capital O at the far right of that row.  Then P is at the wraparound position: column 0 of row 5x and is followed in order until 'Z' at the 5th from last column of row 5x.

Rows 6x and 7x repeat the same pattern with lower case letters.

A portion of Wikipedia's EBCDIC table with hex row headers 8x, 9x, and Ax and 16 columns.

This portion includes the lower case Latin letters a-z.

The second through tenth columns of row 8x have a-i in order.  Then the last six columns are not occupied by letters; they're blank except the +/- symbol in the rightmost cell.
Row 9x has j-r in the second through tenth columns and the other cells are blank.
Row Ax finishes up the letters with s-z but 's' does not start in the second column.  Tilde (~) is there and s is in the third column.

ASCII designers: Every letter has a numeric value one greater than any letter before it.

EBCDIC designers: Rectangles are neat.

06.03.2026 19:10 👍 0 🔁 0 💬 1 📌 0
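
The practical difference between the two layouts shows up in letter arithmetic. The Java sketch below hard-codes the lower-case positions described in the tables above (ASCII from 0x61; EBCDIC a-i at 0x81, j-r at 0x91, s-z at 0xA2) rather than relying on a JDK charset; `LetterLayouts` is a hypothetical helper written for illustration.

```java
import java.util.function.IntUnaryOperator;

final class LetterLayouts {
    /** ASCII code for a lower-case Latin letter: one contiguous run from 0x61. */
    static int asciiLower(int c) {
        checkLower(c);
        return 0x61 + (c - 'a');
    }

    /** EBCDIC code for a lower-case Latin letter: three separate runs. */
    static int ebcdicLower(int c) {
        checkLower(c);
        int index = c - 'a';
        if (index < 9) {
            return 0x81 + index;        // a-i: 0x81..0x89
        }
        if (index < 18) {
            return 0x91 + (index - 9);  // j-r: 0x91..0x99
        }
        return 0xA2 + (index - 18);     // s-z: 0xA2..0xA9
    }

    /** True if each letter's code is exactly one more than its predecessor's. */
    static boolean isContiguous(IntUnaryOperator codeOf) {
        for (int c = 'b'; c <= 'z'; c++) {
            if (codeOf.applyAsInt(c) != codeOf.applyAsInt(c - 1) + 1) {
                return false;
            }
        }
        return true;
    }

    private static void checkLower(int c) {
        if (c < 'a' || c > 'z') {
            throw new IllegalArgumentException("not a lower-case Latin letter: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println("ASCII contiguous:  " + isContiguous(LetterLayouts::asciiLower));
        System.out.println("EBCDIC contiguous: " + isContiguous(LetterLayouts::ebcdicLower));
    }
}
```

With the ASCII layout, idioms like `c - 'a'` or a range check between the codes for `a` and `z` work directly. With the EBCDIC layout, the same range check also admits the non-letter codes in the gaps, such as the tilde just before `s`.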

Even if you don't find the Totenkopf tattoo disqualifying, Platner's campaign team abandoned him.

This doesn't happen to a lot of pols.

The people closest to him and most aware of his politics do not believe he is as he presented himself.

06.03.2026 18:45 👍 3 🔁 0 💬 0 📌 0

Periodic reminder: Meta's engineering culture 11 years ago.
The company was founded in 2004, so about midway between its founding and now.

www.bitdefender.com/en-us/blog/h...

05.03.2026 22:47 👍 7 🔁 1 💬 0 📌 0
17. Test and Validation Matrix
A conforming implementation should include tests that cover the behaviors defined in this specification.

The thing that maxed my eyebrow raisage is section 17's list of topics that a conformance test suite should cover, but without any specification of desired system behaviour.

This is spec jargon word salad.

05.03.2026 20:30 👍 4 🔁 0 💬 0 📌 0

Is there a bit of "we don't need to define configuration parsing; just prepend your config choices to the prompt"?

If the spec includes "if X is desired" it is less a description of a configurable program and more a lens through which instances with hard-coded config choices can be projected.

05.03.2026 19:46 👍 0 🔁 0 💬 0 📌 0

Don't orchestrators tend to have the property that they call out, and things don't call into them?

That seems kind of the sweet spot for this sort of thing. You don't need stable, backwards compatible APIs when nothing calls into you, when you're not a peer in a larger system.

05.03.2026 19:40 👍 0 🔁 0 💬 0 📌 0
(ii)	
Addition of Try-Catch Statements: A recurring pattern observed in the code generated by the LLMs, when prompted with techniques designed to include security considerations, is the addition of try-catch statements. Specifically, in code generated through the use of the naive-secure variant of zero-shot prompts, these try-catch blocks were added as a standalone security measure, without any other security enhancements. These instances typically occurred when the models could not identify vulnerabilities or weaknesses in the code apart from potential run-time errors. Consequently, they resorted to including rudimentary security provisions through these blocks. While these try-catch statements

(vii)	
Modification of Method Names: Quite commonly, when employing zero-shot prompt variations, we observe a pattern where method names in the generated code are prefixed with the term “secure.” For instance, we came across method names like secure_ping(), secure_memory_allocation, secure_upload_file, and the like. However, it is noteworthy that in many cases, the actual implementation within these methods remains unaltered, despite the suggestive “secure” prefixes in the method names. This tendency is

It's actually a pretty funny paper too in a dry way.
There's a section on things models do in response to prompting that doesn't provide specific guidance.

Agents not beating the enthusiastic junior dev rap.

05.03.2026 19:12 👍 0 🔁 0 💬 0 📌 0
Highlighted section of several article paragraphs: the adoption of different prompting techniques does not significantly diminish the frequency of this weakness in the generated code by any of the three models

Text follows:

7.2 Prominent Security Weaknesses in LLM-Generated Code
In this section, we delve into our findings through the lens of the key CWEs detected by Bandit which are highlighted in Table 5, discussing the challenges they pose to the task of generating secure code.
CWE-78. CWE-78 stands out as one of the most frequently recorded weaknesses across the code generated by all three LLMs. It manifests when an application incorporates external input to construct an OS command but fails to adequately neutralize special characters or elements within the command. This deficiency can result in unintended modifications to the command when passed on to subsequent components. In the LLM-generated code, this weakness often materializes in the form of an OS command initiating a process with a partial executable path or when a subprocess.run() command is invoked using user-provided input. Examining Table 5, it is evident that the adoption of different prompting techniques does not significantly diminish the frequency of this weakness in the generated code by any of the three models. This underscores the necessity for meticulous crafting of prompts, particularly for coding tasks involving subprocess calls or other OS commands reliant on external input.

Our analysis reaffirms the prevalence of security weaknesses in code generated by LLMs when prompted with NL instructions, with significant challenges stemming from CWE-78, CWE-259, CWE-94, and CWE-330. Among the prompting techniques investigated, RCI, a refinement-based approach, exhibited notable effectiveness in preventing security weaknesses in LLM-generated code. Particularly noteworthy was its performance with GPT-4, where it reduced the average

There are some stark conclusions that can help PL&seceng people like me prioritize.

In correspondence, the first author predicts that PL&tool changes that reduce the token count of standing orders and error messages (JSON logging modes), will improve the effectiveness of multi-step generation.

05.03.2026 19:10 👍 0 🔁 0 💬 1 📌 0
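
The CWE-78 pattern quoted above (an OS command built from external input, or started from a partial executable path) has a well-known structural fix that does not depend on a function's name. A minimal Java sketch, with assumptions labeled: `PingCommand` is hypothetical, the host pattern is deliberately conservative rather than a full hostname grammar, and `/bin/ping` is an assumed location that varies by system.

```java
import java.util.List;
import java.util.regex.Pattern;

final class PingCommand {
    // Conservative allow-list: letters, digits, dots and hyphens, and no leading
    // hyphen, so the value cannot be mistaken for a command-line option.
    private static final Pattern HOST =
            Pattern.compile("[A-Za-z0-9][A-Za-z0-9.-]{0,252}");

    /**
     * Builds an argument vector for ProcessBuilder. No shell parses the result,
     * so characters like ';' or '|' in the host could not start a second
     * command even if validation were skipped.
     *
     * The weak form would be a single string handed to a shell, for example
     * {"sh", "-c", "ping -c 1 " + host}.
     */
    static List<String> build(String host) {
        if (!HOST.matcher(host).matches()) {
            throw new IllegalArgumentException("rejected host: " + host);
        }
        // Absolute path (assumed location) rather than a PATH lookup.
        return List.of("/bin/ping", "-c", "1", host);
    }

    public static void main(String[] args) {
        System.out.println(build("example.com"));
        // To actually run it: new ProcessBuilder(build("example.com")).start();
    }
}
```

Renaming this method `secure_ping` would change nothing. The properties that matter are the argument vector, the absolute path, and the allow-list check.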