AI Government

NYC's Government Chatbot Is Lying About City Laws and Regulations (arstechnica.com) 57

An anonymous reader quotes a report from Ars Technica: NYC's "MyCity" ChatBot was rolled out as a "pilot" program last October. The announcement touted the ChatBot as a way for business owners to "save ... time and money by instantly providing them with actionable and trusted information from more than 2,000 NYC Business web pages and articles on topics such as compliance with codes and regulations, available business incentives, and best practices to avoid violations and fines." But a new report from The Markup and local nonprofit news site The City found the MyCity chatbot giving dangerously wrong information about some pretty basic city policies. To cite just one example, the bot said that NYC buildings "are not required to accept Section 8 vouchers," when an NYC government info page says clearly that Section 8 housing subsidies are one of many lawful sources of income that landlords are required to accept without discrimination. The Markup also received incorrect information in response to chatbot queries regarding worker pay and work hour regulations, as well as industry-specific information like funeral home pricing. Further testing from BlueSky user Kathryn Tewson shows the MyCity chatbot giving some dangerously wrong answers regarding treatment of workplace whistleblowers, as well as some hilariously bad answers regarding the need to pay rent.

MyCity's Microsoft Azure-powered chatbot uses a complex process of statistical associations across millions of tokens to essentially guess at the most likely next word in any given sequence, without any real understanding of the underlying information being conveyed. That can cause problems when a single factual answer to a question might not be reflected precisely in the training data. In fact, The Markup said that at least one of its tests resulted in the correct answer on the same query about accepting Section 8 housing vouchers (even as "ten separate Markup staffers" got the incorrect answer when repeating the same question). The MyCity Chatbot -- which is prominently labeled as a "Beta" product -- does tell users who bother to read the warnings that it "may occasionally produce incorrect, harmful or biased content" and that users should "not rely on its responses as a substitute for professional advice." But the page also states front and center that it is "trained to provide you official NYC Business information" and is being sold as a way "to help business owners navigate government."
NYC Office of Technology and Innovation Spokesperson Leslie Brown told The Markup that the bot "has already provided thousands of people with timely, accurate answers" and that "we will continue to focus on upgrading this tool so that we can better support small businesses across the city."
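
To make the summary's "guess at the most likely next word" concrete, here is a toy sketch (in Python, with invented candidate words and scores; this is not the MyCity system or any Azure API) of greedy next-token selection: raw scores become probabilities, and whatever is most probable gets emitted, with nothing checking the output against the actual law.

```python
import math

# Invented scores a model might assign to candidate next words after a prompt like
# "Landlords are required to accept Section 8 vouchers:" -- the numbers are made up.
logits = {"True": 2.1, "False": 2.3, "Sometimes": 0.4}

def softmax(scores):
    """Turn raw scores into a probability distribution over candidate tokens."""
    exps = {token: math.exp(s) for token, s in scores.items()}
    total = sum(exps.values())
    return {token: v / total for token, v in exps.items()}

probs = softmax(logits)
next_token = max(probs, key=probs.get)  # greedy pick: the most probable continuation
print(probs)
print("model says:", next_token)        # nothing here consults the actual regulation
```

If the statistics happen to favor the wrong continuation, the wrong continuation is what gets said, which is the failure mode The Markup documented.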

Comments Filter:
  • by databasecowgirl ( 5241735 ) on Friday March 29, 2024 @04:43PM (#64354644)
    Leslie Brown told The Markup that the bot "has already provided thousands of people with timely, accurate answers"

    This information didn't happen to be provided by the bot?
  • by VampireByte ( 447578 ) on Friday March 29, 2024 @04:49PM (#64354662) Homepage

    Since the bot tells lies it would make the perfect politician.

  • by SuperKendall ( 25149 ) on Friday March 29, 2024 @05:00PM (#64354686)

    All of these boomers say we need to limit AI research because AI will "kill us all".

    But here we find the real danger - people who wantonly deploy AI without realizing the limits it has, including hallucination of answers. AI as we know it today should never exist in the role they have placed it in, given how it can simply make up information.

    Instead some kind of advanced search engine should have been applied to look up online docs, or else the AI should have been heavily constrained to point to origin sources, with a secondary system deciding whether the origin source agreed with what it said (a rough sketch of that idea appears after this thread).

    Until we are anywhere close to eliminating hallucination we must allow for open ongoing research, and be more cautious about rolling out AI in positions of public trust.

    • LLMs don't hallucinate. They malfunction.

      • They malfunction, but I would say they are especially prone to malfunctioning in a way that people have termed "hallucination", where the results contain made-up facts.

        • by StormReaver ( 59959 ) on Friday March 29, 2024 @07:24PM (#64354980)

          ...in a way that people have termed "hallucination"....

          I agree with you, and my comment wasn't directed at you. I've seen so many people think that LLMs have some kind of intelligence (and therefore sentience) that I am actively fighting the use of anything that might suggest these programs are anything other than pattern matching algorithms.

          • I am actively fighting the use of anything that might suggest these programs are anything other than pattern matching algorithms

            I see, and I have to agree that is a really great point. I will stop using that term myself, as I agree with your overall goal.

            I think I would maybe say, malfunctioning with sometimes backwards output.

      • I'd describe it as being incorrectly engineered.
    • Pretty sure that unless there are some super-smart humans in the loop, there's bugger-all chance of AI gaining in-depth, domain-specific knowledge. Remember how hard it was to get computer chess and Go to beat decent human opposition. And the domain knowledge for those two games is minuscule compared to something like tax law.
    • All of these boomers which have given me everything good I have in the world and a safe place to indulge myself in insulting them say we need to limit AI research because AI will "kill us all".

      FTFY

      But here we find the real danger - people who wantonly deploy AI without realizing the limits it has,

      Which means everyone, including you.

      That's one problem, but hardly the only one. The "training" materials are effectively a closed loop of reinforcing the correctness or flaws of the training materials.

      • All of these boomers which have given me everything good I have in the world and a safe place to indulge myself

        I can understand your offense at this, but please know I myself am a boomer as well, and that what happened there is I wrote "doomer" and auto-correct altered it for me.

        Doomers have never given us anything good, only fear.

        The "training" materials are effectively a closed loop of reinfocing the correcteness or flaws of the training materials,

        I totally agree with this, in fact I think general LLMs will get much worse over time because of this.

        However, domain-specific LLMs carefully trained not on their own bullshit, so to speak, might offer quick insights into a body of understanding.

        • by cstacy ( 534252 )

          All of these boomers which have given me everything good I have in the world and a safe place to indulge myself

          I can understand your offense at this, but please know I myself am a boomer as well, and that what happened there is I wrote "doomer" and auto-correct altered it for me.

          Doomers have never given us anything good, only fear.

          The "training" materials are effectively a closed loop of reinfocing the correcteness or flaws of the training materials,

          I totally agree with this, in fact I think general LLMs will get much worse over time because of this.

          However, domain-specific LLMs carefully trained not on their own bullshit, so to speak, might offer quick insights into a body of understanding.

          Along with totally wrong and misleading "hallucinated" stuff mixed in, in an undetectable way, so that the user is fooled.

    • by Orgasmatron ( 8103 ) on Friday March 29, 2024 @10:46PM (#64355350)

      The problem fundamentally is that the LLM doesn't "know" anything. It strings words together in a way that is statistically similar to the way a human would do it, according to training data.

      But they don't know what the words mean. They don't know what the sentences mean. They don't know the difference between true and false, right and wrong, real and fake.

      They are fluid bullshit machines.

    • Advanced search engines will not help you. The AIs are already writing junk into the very web pages that your search engine will read. This will not get better until we humans start creating and enforcing information pollution laws.
      • Advanced search engines will not help you. The AIs are already writing junk into the very web pages that your search engine will read.

        I'm thinking of advanced search engines that only index real documents written by humans. Many places that offer SDKs, for example, have extensive documentation that, today at least, was all written by humans... so a classic search engine that only referenced that catalog would still be reliable.

        Otherwise I agree with you, and in fact if you think abo

        • Yup, I've used Bing rarely over the years, but I'm using it more now. Don't like the layout with all the junk, but it does give better results than Google.
    • All we need to do is keep the bots to their word. If a company or government implements it and it gives an answer that is not in favor of that entity, it must nevertheless accept it. Whether that is a discount or an ordinance, this bot has now held that certain regulations are invalid, that's great, keep it in place.
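
A rough sketch of the constraint proposed at the top of this thread (answers must point to an origin source, with a secondary system deciding whether that source actually agrees). Everything here is hypothetical: retrieve(), the generate callback, and the word-overlap test are stand-ins for illustration, not any real MyCity or Azure component.

```python
def retrieve(question: str, corpus: dict[str, str]) -> tuple[str, str]:
    """Pick the corpus document sharing the most words with the question."""
    q_words = set(question.lower().split())
    def overlap(doc_id: str) -> int:
        return len(q_words & set(corpus[doc_id].lower().split()))
    best = max(corpus, key=overlap)
    return best, corpus[best]

def source_supports(answer: str, source_text: str, threshold: float = 0.5) -> bool:
    """Crude 'agreement' check: most answer words must also appear in the cited source."""
    answer_words = set(answer.lower().split())
    source_words = set(source_text.lower().split())
    return len(answer_words & source_words) / max(len(answer_words), 1) >= threshold

def answer_with_guardrail(question: str, corpus: dict[str, str], generate) -> str:
    """Draft an answer, then refuse to show it unless the cited source backs it up."""
    doc_id, source_text = retrieve(question, corpus)
    draft = generate(question, source_text)  # hypothetical LLM call
    if not source_supports(draft, source_text):
        return f"Not confident; please read {doc_id} directly."
    return f"{draft} (source: {doc_id})"
```

In this shape an unsupported draft never reaches the user; whether a word-overlap test is a good enough agreement check is exactly the hard part such a design glosses over.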

  • AI chatbots will give excellent customer service
    Unfortunately, today people fall for the hype and put crappy, preliminary prototypes into service

  • Great job! (Score:5, Insightful)

    by gweihir ( 88907 ) on Friday March 29, 2024 @05:18PM (#64354726)

    Natural morons installing an artificial moron.

    • by zooblethorpe ( 686757 ) on Friday March 29, 2024 @06:02PM (#64354818)

      "Natural morons installing an artificial moron."

      I often find myself thinking that "AI" anymore stands for "Artificial Ignorance".

      • How about incompetence? Because an AI "knows" tons of things in the sense that the information is in there somewhere, but it doesn't know anything in the sense of being able to project the data into rational information.

        • by gweihir ( 88907 )

          Well, if it was just incompetent, that would not be so bad. But instead it has delusions of being great at things and starts to hallucinate far too often.

          I mean, a lexicon is "incompetent" (it cannot do anything), but really useful. A version where the more obscure articles are often pure fabrications loses that usefulness.

      • by gweihir ( 88907 )

        Artificial Idiot is also pretty fitting. Also takes into account that an Artificial Idiot may be able to take the job of a natural idiot and do it cheaper.

      • "Natural morons installing an artificial moron."

        I often find myself thinking that "AI" anymore stands for "Artificial Ignorance".

        I myself prefer “Autocomplete Insanity”

      • Well yeah - the name is kind of the problem, and for my money, the whole "because people are stupid" meme does not apply here.

        The tech industry put the word "Intelligence" right in there in the name and now we're acting surprised that people assume some relationship to the actual meaning of the word?

        If a company sold contracts for something called "Car Insurance" and it turned out to be time-shares, would we be sitting around joking about how people are too dumb to understand legal contracts, or would we th

  • Lawsuits (Score:5, Insightful)

    by CAIMLAS ( 41445 ) on Friday March 29, 2024 @05:22PM (#64354732)

    We've yet to even really begin to see the lawsuits resulting from the wide-scale adoption of LLMs like this.

    Everyone thinks jobs will be replaced en masse and that AI will pose an existential threat for humanity, but I think those theories are vastly, wildly speculative at this point - and definitely not anything that we'll see in practice in the next 5, 10 years. Why?

    They're only looking at a very small subset of utility and basing their assumptions solely on the technology, not on how it will be used or the assumptions people will make based on it.

    You all saw the NVIDIA "AI nurses" bit recently, I'm sure. Why nurses, and not doctors? Well, nurses don't practice medicine, and doctors do. Nurses can't be sued for malpractice, and do not need malpractice insurance.

    Now imagine for a second the lawsuits that would result from tens of thousands of people getting bad medical advice, even harmful medical advice, due to hallucinations. Imagine the lawsuits from discrimination against minorities, or some other bias which was programmed in. A poorly trained workforce which makes mistakes in reading and comprehension of the laws, rules, etc. is one thing - you'll have a spread of abilities across all employees - but a singular AI with the same biases writ large is another.

    Now remove the filter of experience from these chatbots and other tools I'm sure they'll try to create, and you start to see a broader problem: AI-controlled military drones which, maybe 5% of the time, intentionally target civilians. AI which will hallucinate the wrong license plate and send a ticket to someone unrelated. People getting a cancer prognosis from an automatic "nurse" along with admonitions that it's not a big deal and they don't need further care. And so on...

    Those liabilities will sink any company relying on the technology extremely quickly.

    LLMs have a long way to go before they're more than a hype gimmick for broad adoption. We'll find general utility in them to improve our own workflows in the near-to-intermediate term (1-5 years), but we may find a limit to techniques and models which makes broad adoption impossible at a societal scale.

    • We've yet to even really begin to see the lawsuits resulting from the wide-scale adoption of LLMs like this.

      Well, if the lawyers would just use LLMs for all their case work, these lawsuits would get moving a lot faster.

    • Remember when they thought it would be easy to make cars drive themselves? Wasn't that supposed to happen by 2020?
    • by sjames ( 1099 )

      So far, only the legal profession has seen strong liability for uncritical use of AI. For example, a lawyer had an AI write a brief supporting his client's position and submitted it to the court. When the judge found that the cited case law was fictional (a hallucination), the lawyer was summoned for a show cause hearing to decide if he would be allowed to continue practicing law.

      • by CAIMLAS ( 41445 )

        Right, and that's half the point.

        We haven't seen general applicability of LLMs yet because they're not very good. The results are (often) obviously wrong, or wrong often enough that "AI as a service" isn't tenable.

        But "AI as a service" - doctors, nurses, lawyers, mechanic diagnosticians, etc. - is the objective here. They want a turnkey solution that you can turn to as an expert source of truth. That's both the holy grail, and the only commercially viable outcome which won't lead to lawsuits and mass disill

  • I thought I'd be able to get some hilarious answers by asking absurd questions, but each time it just responded with a generic "I'm sorry Dave, I'm afraid I can't do that" response. Now I'll never know the answers to my important burning questions about NYC, such as:

    If a homeless person steals my Cheetos, am I obligated to buy him a Tesla model 3?
    Can I fly a drone inside a cardboard box?
    If my business runs out of beer, will I have to climb on the roof and sing an oompa loompa song about the dangers of alco

  • AI-powered chatbots are flawed by design because they don't know that they don't know. How can one be stupid enough to use a chatbot without knowing that?

    • "AI powered chabots are flawed by design because they don't know that they don't know. How can one be stupid enough to use a chatbot without knowing that?"

      It's the Kunning-Druger effect. :D

    • The solution is don't assume the chat bot knows anything. Any prompt should be treated as a query against the primary law text. The one and only valid result is a citation of that text. The bot can summarize it and relate it back, but that citation has to be a bottleneck in the system (a toy sketch of this idea follows at the end of this thread).
      • by cstacy ( 534252 )

        The solution is don't assume the chat bot knows anything. Any prompt should be treated as a query against the primary law text. The one and only valid result is a citation of that text. The bot can summarize it and relate it back, but that citation has to be a bottleneck in the system.

        This is impossible, because there is no connection between the law (original input), and the output from the chatbot. The output is the input, run through a word blender. There is no place in the system where what the chatbot says can be related to the input. There is no place to put a citation or reference or anything like that. All an LLM does is make word salad out of the input, with no meaning at all, and statistically you hope it comes out saying something similar to the original input. It does NOT kno

        • You're right but you are also a bit wrong. Let me explain. The LLM model does not contain factual information, but it does capture the structure of language. Ask it to discuss a subject, and you'll get the equivalent of someone paraphrasing what they overheard at Starbucks. The information might be right but you shouldn't count on it. But there is a difference between domain knowledge and language knowledge. The model does a pretty good job at language. So if you ask it a question about a document, or to su
          • The problem is that people want to ask it questions like "explain the novel War and Peace" and, from one paragraph, pass themselves off as someone who actually read the book and understood it. But it's a fallacy from the beginning that any single paragraph could ever give you that much insight. It's why people are dubious that it could actually steal jobs.

            It is making people feel like the insights are within reach, but nothing is ever going to replace reading the book because it just takes a lot of Engli
          • by cstacy ( 534252 )

            You're right but you are also a bit wrong. Let me explain. The LLM model does not contain factual information, but it does capture the structure of language.

            It's just doing statistical auto-complete on word fragments. It's a pile of shit.
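
A toy sketch of the "citation as a bottleneck" idea from a few comments up: the only text that can reach the user is a verbatim excerpt tied to a citation. The section names, their wording, and the keyword matching below are placeholders invented for illustration, not authoritative NYC law.

```python
LAW_SECTIONS = {
    "Admin Code 8-107(5) (placeholder)": "Landlords may not refuse lawful sources of "
                                         "income, including Section 8 housing vouchers.",
    "Admin Code 20-700 (placeholder)":   "Deceptive trade practices are prohibited.",
}

def lookup(question: str) -> list[tuple[str, str]]:
    """Return (citation, verbatim text) pairs whose text shares any word with the question."""
    q_words = set(question.lower().split())
    return [(cite, text) for cite, text in LAW_SECTIONS.items()
            if q_words & set(text.lower().split())]

def answer(question: str) -> str:
    hits = lookup(question)
    if not hits:
        return "No matching provision found; consult the primary text or a professional."
    # Nothing free-floating: every line of output is anchored to a citation.
    return "\n".join(f'{cite}: "{text}"' for cite, text in hits)

print(answer("Do landlords have to accept Section 8 vouchers?"))
```

The useful property is the failure mode: with no match, this toy refuses rather than inventing an answer, the opposite of what the MyCity bot reportedly does.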

  • The page linked at the end of the first paragraph really is hilarious. https://bsky.app/profile/kathr... [bsky.app]
  • If I tried my best to trick a real person on the phone into giving me incorrect information, would I be any less successful?
  • There is something seriously wrong with how this problem was tackled. The bot seems to have no cognizance of how cities are typically structured bureaucratically and does not recognize entities like "City Attorney Office" -- which in New York is called "New York City Law Department". ChatGPT however does seem to possess this understanding about New York City. It feels like they trained on just the city website contents without enough supporting documents about cities in general. Their fear of a porn hal
    • by cstacy ( 534252 )

      There is something seriously wrong with how this problem was tackled. The bot seems to have no cognizance of how cities are typically structured bureaucratically and does not recognize entities like "City Attorney Office" ...

      I hate to break it to you, but the bot does not know what any of the words mean. Not your prompt, not the output it generates, nothing. Ever.

      That is not how LLMs work.
      They do not contain facts.
      Just meaningless word fragments that it strings together without any possible regard for what anything means at any point.

      It would be possible, of course, to make a system (which I suppose you could call "AI" if that gives you tingles) that does what you propose here. Have a "knowledge" of what the pieces of the Gover

  • ... if ten percent for the big guy is still the going rate.

  • I suspect that the bot is simply providing the information it was given. If you scrape the interwebs for information & then process it for the most statistically relevant points, it'll give you the de facto consensus, which isn't the same thing as canonical law. I bet there's a tonne of spurious advice from dodgy estate agents, lawyers, accountants, etc. that is just downright false & illegal.

    Basically, if you're looking for consensus & a reflection of what most people claim to be true
    • Then why would anyone pollute the answers that their bot gives by training it on opinions rather than fact? That just opens them to legal liability.
      • To build an LLM that produces reasonably coherent, cohesive texts requires a large corpus of language samples in order to calculate the statistical probabilities with a reasonable amount of statistical power. Large quantities of freely available text to build a corpus from aren't that easy to come by, & some notable GenAI companies have taken shortcuts, hence the current slew of lawsuits by media companies claiming copyright infringement. Where are they supposed to find that large a trove of texts? They
        • OK, well, if they can't train them to be experts in a topic, then it may qualify as AI because it forms English sentences, but it's not an AI you can use unless your job can be done by just a moderately informed stranger.
  • Because there are no means to punish it into learning NOT to lie. We punish kids when they lie, to untrain that behavior, and we punish adults who lie. Lying is a natural state. How are you going to punish a computer program?

  • Is that what the story is alleging?

  • by Mozai ( 3547 ) on Saturday March 30, 2024 @09:30AM (#64356172) Homepage

    Leslie Brown told The Markup that the bot "has already provided thousands of people with timely, accurate answers"

    When asked about the fallen debris, city engineers told reporters "we already made thousands of correct measurements..."

    When asked about the customers suffering from food poisoning, the kitchen owners told reporters "we've already provided thousands of satisfactory meals..."

    I get that mistakes will be made -- hell, the chatbot might even make fewer mistakes. What puts a bee in my bonnet is this attitude that nobody can be held accountable for hallucinations and errors made by the clockwork device you purchased hoping to avoid giving a human a job.
