AI at school? Is it just there and do we have to deal with it?

Preliminary remarks for this English version

Although this article takes a critical look at the use of AI in schools, I used an AI tool “made in Germany” for the translation (https://www.deepl.com). The result is not my style and violates my idea of fluent English.

Because of these thoughts, I sometimes face strong criticism from the German education community and even from the education administration I work for. It seems to me that I am seen as breaking (male) tech toys, or at least spoiling them.

Introduction

Not a day goes by on social media without new, cool tips on using AI in the classroom. For three years now, I’ve been giving talks on AI to all kinds of groups and committees, and over that time my view of the topic has become increasingly critical.

1. AI applications that generate language prevent learning processes

Various researchers and experts point to serious shortcomings in language models, which form the backbone of many educational offerings. The effects on learning processes are also being described more and more critically. Significantly, the most nuanced criticism almost always comes from people with a background in computer science. Advocates of using language models in teaching invariably argue that it all depends on how they are used. I am not convinced of this.

As an example, I would like to refer to a recent study by Rainer Mühlhoff and Marte Henningsen, who took a closer look at a Fobizz tool for the automatic assessment of homework. There are several such tools and offerings on the German market, some of which have even received start-up awards. What they have in common is that they are based on the same technology and are explicitly aimed at teachers. The study’s data base is relatively small; unfortunately, this is the case with many studies in the education sector. Here are some excerpts from the results:

  1. “Both the suggested overall grade and the qualitative feedback varied significantly between different assessment runs of the same submission. This volatility poses a serious problem, as teachers relying on the tool could unknowingly award ‘cherry-picked’ and potentially unfair grades and feedback.”
  2. “Even with full implementation of the suggestions for improvement, it was not possible to hand in a ‘perfect’ submission, i.e. one that drew no further objections. A near-perfect score was only achieved by revising the solution with ChatGPT, which signals to students that they need to rely on AI support to achieve a top score.”
  3. “The tool has fundamental shortcomings, several of which the study classifies as ‘fatal obstacles to use’. It is pointed out that most of the observed shortcomings are due to the inherent technical characteristics and limitations of large language models (LLMs). For these reasons, a quick technical solution to the shortcomings is not to be expected.”

The study refers to the use of language models by teachers. That should be use by experts with the relevant experience and expertise in carrying out assessments.

The demand for nationwide provision of so-called AI tools, largely made without professional reflection, can be found both in the press and among associations. Our media center does in fact provide such access to teachers at schools run by the district. I would now consider tying this provision to prior mandatory training and awareness-raising.

With regard to use by students, Jeppe Klitgaard Stricker has put forward some theses and observations that I find remarkable:

  1. Intellectual mirroring (students unconsciously adopting AI speech patterns)
  2. Digital dependency disorder (students panic when AI tools are unavailable)
  3. The illusion of mastery (students thinking they understand because AI explained it)
  4. Collaborative intelligence decay (students abandoning human brainstorming when AI is faster)
  5. Reality-prompt confusion (students viewing real-life challenges as prompts to optimize)
  6. Knowledge confidence crisis (students doubting human wisdom vs AI certainty)
  7. AI-induced perfectionism (the pressure to match AI’s flawless outputs)

I would like to replace the word “students” with the word “learners” here, because many of the points are likely to apply to adults as well. This perspective is quite new to me, because up to now I have tended to take a cognitive-theoretical approach in my criticism of the use of language models in the classroom:

In a nutshell: Our working memory contains what we are currently thinking. Among other things, it is fed by what we have transferred to our long-term memory over the course of our lives. This knowledge is more densely networked in the long-term memory of experienced people (experts) than in that of inexperienced people (novices). The output of language models overloads the working memory of novices much faster than that of experts, because novices have less pre-networked knowledge in long-term memory to compensate.

Of course, AI can be used at any stage, e.g. when writing seminar papers. However, learning groups are made up of novices whose knowledge is networked in long-term memory to very different degrees, and the extent to which such use makes sense for them must be examined very carefully.

Given these premises, language models can only be used to promote learning if the novices already have a certain amount of networked prior knowledge. For me, it would be irresponsible to focus teaching solely on how to use and operate the tools.

Experts, on the other hand, are probably much better at evaluating the output of language models, but without a basic understanding of how the models work, they cannot use them in a reflective manner. Who, for example, has the same text evaluated several times by an AI tool and then compares the outputs with each other, as was done in the study cited? And anyone who does so quickly finds that the marketing promise of time savings no longer holds. Experts also tend to be “susceptible” to the mechanisms formulated by Stricker.

2. Products of AI applications are the new plastic and contaminate the communication space of the Internet

Linux Lee, among others, came up with the idea of seeing the products of generative AI as analogous to plastic made from crude oil. Just as petroleum products fill our physical world, the products of generative AI (music, images, videos, texts, etc.) fill the communicative space of the internet.

In discussions about sustainability, plastic is quickly cast in a negative light, but as a material it is indispensable in many areas of modern society. One major difference lies in what can be done with existing plastic. In principle, plastic made from crude oil can be recycled, but this is not economically viable, and the production and recycling chain lacks the control mechanisms that would make it possible. With a well-structured plastic cycle, the material could in principle be reused multiple times without any major loss of quality.

The more products of generative AI enter the communication space of the internet, the more likely it is that they themselves will become the actual training basis for AI. This is referred to as the “rebound effect”. Applied to the education system, the thesis has been put forward, more or less humorously, that at some point a “teacher AI” will evaluate the students’ “AI homework”. Ironically, the study by Mühlhoff and Henningsen provides “initial evidence” of precisely this. In contrast to plastic made from crude oil, the resource “product of a generative AI” is not really limited if, for example, renewable energy is used to produce it. This means that there is no real interest in regulating these products, nor any perceived need to do so. Simply taking a critical view of AI in an educational context is enough to be associated with hostility towards innovation.

This in turn has to do with the fact that AI is often not viewed in a differentiated way: Using similar computer science mechanisms, AI can generate language, or it can calculate protein structures very efficiently in the development of medicines. The latter can become sustainable products, as is also possible with plastic made from crude oil. Both “are” AI.

I would evaluate the latter use of AI very differently, as the resulting product has its effect on a completely different level. I miss this distinction in the public discussion. In the education sector in particular, the topic is usually saturated with marketing and buzzwords and usually reaches a target group that is not sufficiently educated in information technology.

Yes, what can you do? AI is here to stay!

… and won’t go away again. In my last graduation speech at my son’s school, I described how being able to choose is a luxury. In fact, you can choose not to use language models in class. Personally, I find it difficult to assign longer pieces of writing as homework; I prefer to have them written in class, e.g. in combination with collaborative writing tools. The resulting products are already an independent achievement. A “follow-up check” of spelling and grammar using AI-based tools works very well. Especially at the intermediate level, the skills for evaluating “AI interventions” in this area should, in principle, already have been acquired at school and be “pre-networked” in long-term memory. In theory, at least.

One of the main tasks of education will be to communicate that certain things should be mastered before AI is used, precisely because the machine can do them so much better. And this applies not just to students, but above all to us teachers.

When we think about this, we very quickly end up with structural considerations about the German education system itself.

“Oh, Luise, stop … that’s too broad a field.” (Theodor Fontane, Effi Briest, last sentence)