Recent reports suggest that Google is in serious trouble: according to the media, part of the training data for its Bard chatbot came from ChatGPT. The revelation has sent shockwaves through the tech world and led many to question the ethical practices of Google and other major tech companies.
A top researcher who left Google for OpenAI reportedly revealed that Bard was trained with ChatGPT data. If accurate, this might be Google’s biggest scandal. Ironically, the AI built to compete against ChatGPT would have been trained on ChatGPT’s own output.
Microsoft has an exclusive license to use ChatGPT for commercial purposes, so Google could well be sued. The million-dollar question is:
Did Google Bard Plagiarize ChatGPT?
Jacob Devlin is a heavyweight name in AI. He is one of the authors of the paper on the BERT model that Google published in 2018. That study was the catalyst for a surge in academic AI research, and it is fair to say that Devlin’s work provided a strong foundation for the language models used by both Google and OpenAI.
The Information claims that one of the reasons Devlin quit Google was that he learned that Bard, the contender Google fielded to compete against ChatGPT, was being trained on ChatGPT data. Before resigning, he warned Google CEO Pichai and other executives that the Bard team was training on data from ShareGPT.
Steven Tey, the creator of ShareGPT and one of the people involved, claimed to have known about this for a long time and said that word had already spread throughout Google, where many employees were unhappy and concerned. He later posted again, saying that the cat may now be out of the bag, a reference to secrets leaking unintentionally.
Insiders claim that Google quickly stopped using this data to train Bard once Devlin raised his warning. Nevertheless, when The Verge asked Google spokesman Chris Pappas about the incident, he denied it, insisting that no data from ShareGPT or ChatGPT was used in Bard’s training.
It’s interesting to note that OpenAI has itself been the subject of debate, with numerous websites and artists accusing ChatGPT of taking information from them without permission. This marks the first time that a different company has been accused of taking data from ChatGPT.
Google’s Use of ChatGPT Data Puts Ethics in Question
Before this, the popular ChatGPT was incorporated into Bing, which had already pushed Google’s stock price down. With its position as the leader in search under challenge, Google has been frantically trying to catch up. In that rush, Bard made an error during the press conference, embarrassing Google and wiping $100 billion off its market value.
Bard finally opened to users after more than a month of secrecy. Those who tried it found that its accuracy is not high and that it is not very effective at writing scripts. Compared with ChatGPT, it comes up short.
If Google depended so heavily on ShareGPT, Devlin’s decision to leave Google and join OpenAI is all the more significant: an expert of his standing can go straight to OpenAI without the need for a go-between. The fact that Microsoft holds an exclusive license to use ChatGPT for commercial purposes makes matters even more concerning.
OpenAI is not blameless either. The information used to train ChatGPT was itself taken from the open Internet, and the human creators behind it, including authors, videographers, and artists, never granted permission for their content to be used.
Microsoft, for its part, recently disclosed various incentive schemes to reward content producers for their contributions to Bing Chat responses. Yet whether it’s Google or Microsoft, at the end of the day, all that matters is shareholder profit.
Final Thought
It is not intrinsically unethical to use chatbot-generated data to train AI algorithms. But it needs to be done with the right approval and oversight. Therefore, companies must be open and honest about the sources and procedures utilized to gather their data. Companies must also guarantee that human users’ autonomy and privacy are respected.
All in all, there are significant ethical concerns with using ChatGPT data to train Google’s most recent language model. High transparency and ethical standards must be applied to the AI industry.
Thus, firms must make sure that the proper consent and oversight are in place before using chatbot-generated data for commercial reasons. Failing to do so jeopardizes the autonomy and privacy of human users and taints the reputation of the AI sector as a whole.