One of the best things ever about LLMs is how you can give them absolute bullshit textual garbage and they can parse it with a huge level of accuracy.
Some random chunks of html tables, output a csv and convert those values from imperial to metric.
Fragments of a python script and ask it to finish the function and create a readme to explain the purpose of the function. And while it’s at it recreate the missing functions.
Copy paste of a multilingual website with tons of formatting and spelling errors. Ask it to fix it. Boom done.
Of course, the problem here is that developers can no longer clean their inputs as well and are encouraged to send that crappy input straight along to the LLM for processing.
There’s definitely going to be a whole new wave of injection style attacks where people figure out how to reverse engineer AI company magic.
One of the best things ever about LLMs is how you can give them absolute bullshit textual garbage and they can parse it with a huge level of accuracy.
Some random chunks of html tables, output a csv and convert those values from imperial to metric.
Fragments of a python script and ask it to finish the function and create a readme to explain the purpose of the function. And while it’s at it recreate the missing functions.
Copy paste of a multilingual website with tons of formatting and spelling errors. Ask it to fix it. Boom done.
Of course, the problem here is that developers can no longer clean their inputs as well and are encouraged to send that crappy input straight along to the LLM for processing.
There’s definitely going to be a whole new wave of injection style attacks where people figure out how to reverse engineer AI company magic.
Just use BeautifulSoup.