Attributing AI content

donkjr · August 14, 2023, 1:51pm

A recent thread and my use of AI-generated content caused me to wonder:

“What is the proper way to attribute content to an AI tool”

So I asked Bard and ChatGPT.

Note: I added two AI:[xx] “canned replies” that match the below.

FROM BARD:
The proper way to attribute content to an AI tool depends on the style guide you are using. However, there are some general guidelines that you can follow.

Identify the AI tool. In your text, identify the AI tool that generated the content. You can do this by name, such as “Bard” or “GPT-3.”
Provide the date. Include the date when the content was generated. This can help your readers to understand the context of the content and to determine its relevance.
Provide a link to the AI tool. If possible, provide a link to the AI tool so that your readers can learn more about it.
Acknowledge your role. If you edited or revised the content generated by the AI tool, acknowledge your role in the process. This is especially important if you made significant changes to the content.

Here is an example of how you might attribute content to an AI tool in APA style:

“This text was generated by Bard on August 14, 2023. Bard is a large language model from Google AI, trained on a massive dataset of text and code. The text has been edited by [your name] to improve clarity and grammar.”

FROM CHATGPT:
When attributing content to an AI tool like GPT-3.5, it’s important to provide accurate and transparent information about the source of the content. Here’s a general guideline for properly attributing content to an AI tool:

Mention the AI tool: Clearly state that the content was generated with the assistance of an AI tool, specifying the tool’s name, version, or model. For example, you can mention “Generated using GPT-3.5 by OpenAI.”
Include a disclaimer: Add a disclaimer to clarify that the content was created by an AI and may not represent the views or opinions of a specific individual or organization. This helps readers understand that the content is algorithmically generated.
Provide context: If necessary, provide context about the AI’s capabilities and limitations. This can help readers understand the context in which the content was generated.
Transparency about human involvement: If a human provided input, guidance, or editing for the AI-generated content, you should mention that as well. This helps maintain transparency about the content creation process.
Publication date: Include the date when the content was generated. This helps readers understand the timeframe of the AI’s knowledge.
Cite sources: If the AI-generated content incorporates information from specific sources, provide proper citations for those sources as you would with any other type of content.
Credit the platform: If the AI tool is accessed through a specific platform or software, you might want to mention the platform’s name or link to it.

Here’s an example of how you could attribute content to an AI tool:

>This content was generated using GPT-3.5, an artificial intelligence language model developed by OpenAI. Please note that this text was algorithmically generated and may not reflect the opinions or views of any individual or organization. Human input and editing may have been applied to enhance clarity and coherence. The content was created on [Date]. For more information about GPT-3.5 and its capabilities, visit [Link to OpenAI's website].

NOTE: I have no idea how the above slider was created!!!

Note: I refrained from attributing the attribution answer
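Both answers boil down to the same handful of fields: the tool, the date, an optional link, and any human editing. A minimal Python sketch of turning those fields into a notice might look like this. The `Attribution` class, its field names, and the output wording are all hypothetical and chosen for illustration; they do not come from any style guide.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Attribution:
    """Fields suggested by the two answers; all names here are hypothetical."""
    tool: str                        # e.g. "Bard" or "GPT-3.5"
    generated_on: date               # when the content was generated
    vendor: Optional[str] = None     # e.g. "OpenAI"
    link: Optional[str] = None       # where readers can learn about the tool
    edited_by: Optional[str] = None  # human who revised the output, if any


def format_attribution(a: Attribution) -> str:
    """Build a short attribution notice from whichever fields are set."""
    source = f"{a.tool} ({a.vendor})" if a.vendor else a.tool
    parts = [f"This content was generated by {source} on {a.generated_on.isoformat()}."]
    if a.edited_by:
        parts.append(f"It was edited by {a.edited_by}.")
    if a.link:
        parts.append(f"More about the tool: {a.link}")
    return " ".join(parts)


if __name__ == "__main__":
    notice = Attribution("GPT-3.5", date(2023, 8, 14), vendor="OpenAI", edited_by="donkjr")
    print(format_attribution(notice))
```

Keeping the fields separate from the wording means the same record could be rendered in whatever form a given style guide asks for.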

mcdanlj · August 14, 2023, 8:53pm

How about asking the AI tool to cite its sources, and then validating the citations, because these tools are known to make up plausible URLs?

Them asking to be attributed without citing their own sources is richly ironic, and not in a good way…

donkjr · August 15, 2023, 9:56am

My intention with the attribution was to let the reader know where the content came from, and less to say anything about how accurate it is, i.e. that I did not generate the content.

mcdanlj · August 15, 2023, 11:30am

I’m salty at them, not you.

LLMs may cite some sources, but as the ChatGPT citation suggestion indicates, you should validate its citations for it because it isn’t capable of validation, and it’s known to just make up plausible URLs. This is, to the best of my knowledge, an intrinsic problem with how LLMs work.

And then they have the temerity to request formal citation of their summaries. They want the content of your citation to include advertising on their behalf (e.g. “trained on a massive dataset of text and code”).

I think that if they want APA or similar citations, they should also hold themselves to the same standard. Sauce for the goose is sauce for the gander.

I agree that it makes sense to label the content. Regarding “where the content came from” that’s kind of tricky, given that LLMs are bad at divulging that. I do think it’s actually important to label it. It’s not that people are reliable either! Sometimes the “tells” for unreliability are similar. But sometimes they are not, and so far we’ve seen evidence of smart people who don’t know the limits of the technology.
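For anyone who wants to act on the "validate its citations" advice, a first step could be pulling every URL out of an LLM answer so that a person can open each one. This is only a sketch: the `extract_urls` helper and its regular expression are assumptions made for illustration, and a well-formed URL says nothing about whether the page exists or supports the claim.

```python
import re
from typing import List
from urllib.parse import urlparse

# Rough pattern for http(s) URLs in running text; an assumption, not a full URL grammar.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")


def extract_urls(text: str) -> List[str]:
    """Return the distinct http(s) URLs in text, in order of first appearance."""
    seen: List[str] = []
    for match in URL_PATTERN.findall(text):
        url = match.rstrip(".,;:!?")  # drop trailing sentence punctuation
        if urlparse(url).netloc and url not in seen:
            seen.append(url)
    return seen


if __name__ == "__main__":
    answer = "See https://example.com/a, and (https://example.org/b)."
    for url in extract_urls(answer):
        print("check by hand:", url)
```

The checking itself stays manual here on purpose, since the point is that someone has to read the source and confirm it says what the model claims.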

mcdanlj · August 15, 2023, 12:07pm

Incidentally, one reason they might care about citing them is as negative feedback for the training loop, to try to ward off model collapse. If LLM-generated text is cited, they might be able to filter it out of any updated training corpus.
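To make that idea concrete, here is a rough Python sketch of filtering labeled text out of a corpus. Whether any vendor actually does this is speculation; the marker pattern and the `drop_labeled_ai_text` function are hypothetical and simply modeled on the attribution wording the tools suggested.

```python
import re
from typing import Iterable, List

# Hypothetical markers, modeled on the suggested attribution wording.
AI_MARKER = re.compile(
    r"generated (?:by|using) (?:Bard|ChatGPT|GPT-[\w.]+)",
    re.IGNORECASE,
)


def drop_labeled_ai_text(documents: Iterable[str]) -> List[str]:
    """Keep only the documents that carry no recognizable AI attribution."""
    return [doc for doc in documents if not AI_MARKER.search(doc)]


if __name__ == "__main__":
    corpus = [
        "This text was generated by Bard on August 14, 2023.",
        "I tied this fly myself.",
        "Generated using GPT-3.5 by OpenAI.",
    ]
    print(drop_labeled_ai_text(corpus))
```

A filter like this only catches text that was labeled in the first place, which is one more reason consistent labeling would matter.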

Jammy · August 19, 2023, 9:14pm

As a professional games developer, and a programmer for 30 years (yikes!) - I love AI (specifically ChatGPT).

When you are worn down by the constant change in technology, and by requests to "just make something" in another language, to the point that you can’t keep up with the syntax anymore (you can code - make a website!) - it’s proven to be invaluable.

I’ve found it doesn’t just “write code”, but it gives you insightful direction. You need some knowledge to get things to work.

Not sure how it will all end, but it’s the best tool I’ve used in my career.

donkjr · August 19, 2023, 10:13pm

I know!

I found use for it in:
-Coding Arduino stuff
-Power Automate flows
-Mechanical, electrical and magnetic calculations
-Fly fishing recommendations: specific flies for specific rivers at specific times of the year.
-Poems & song creation
-Recommendations for responding to controversial social input. Keeps me grounded.

If you’re old enough, you will remember that when the electronic calculator obsoleted the slide rule, we were all predicted to systematically lose our math skills.

mcdanlj · August 23, 2023, 12:04am

I’d like to be clear that I’m not disputing that LLMs can be powerful tools. It’s that as they built these powerful tools, those responsible for building them didn’t also invent better capabilities for identifying relevant sources from which they draw. It feels cheap to me specifically for them to request citation while doing a terrible job of doing the same thing. This wouldn’t matter if they weren’t powerful tools. (If no one had a reason to be in the forest to hear the tree fall, it wouldn’t matter whether it makes a sound.)

This lack of reciprocity makes me not take their request for a particular form of attribution very seriously. At the same time, I agree wholeheartedly that it is important to indicate material that was generated by an LLM, most particularly if you haven’t validated it thoroughly.

Also, I wonder if one of the reasons that they want citations is that the outputs look like they aren’t subject to copyright?

Oscar · September 5, 2023, 12:50am

I just started playing with SDXL 1.0 for CNC pattern creation. Not perfect, but a ton of potential. For personal use these would be a lot of fun to play with.

Did add the citation so good to know. lol