WEBVTT

00:00:00.000 --> 00:00:13.840
So, what I want to talk about today is this.

00:00:13.840 --> 00:00:19.620
Storytelling in digital humanities, and people who know me will know I will always talk about

00:00:19.620 --> 00:00:21.120
fiction as well.

00:00:21.120 --> 00:00:26.820
So even real world stuff for me is very much connected to fiction.

00:00:27.220 --> 00:00:31.620
I'm one of the people who does a bit more fundamental stuff today.

00:00:31.620 --> 00:00:36.740
So just reflecting a little bit on when we talk about digital humanities, what is it

00:00:36.740 --> 00:00:38.040
we actually mean?

00:00:38.040 --> 00:00:41.640
Because there are so many things we can do in digital humanities.

00:00:41.640 --> 00:00:49.860
So we can archive stuff, what we do can be about preserving things, making stuff accessible.

00:00:49.860 --> 00:00:55.100
And here, for instance, this is a manuscript that Charles Dickens wrote, and manuscripts

00:00:55.180 --> 00:01:01.380
are very precious things that we want to conserve, and therefore digitalisation is a very important

00:01:01.380 --> 00:01:02.380
bit.

00:01:02.380 --> 00:01:05.060
But just archiving, preserving isn't enough.

00:01:05.060 --> 00:01:08.560
We also then need to study what's going on.

00:01:08.560 --> 00:01:12.940
And it's about creating as well, and experiencing.

00:01:12.940 --> 00:01:18.220
So here the example of the 3D printing is that sometimes when you go to museums, it's

00:01:18.220 --> 00:01:24.140
all nice and well if everything is behind glass, but it's also about experiencing.

00:01:24.180 --> 00:01:30.500
And so some of the digital methods that we have can help us experience reality in a new

00:01:30.500 --> 00:01:31.500
way.

00:01:31.500 --> 00:01:35.140
It's also about understanding the history, and something I only learned recently: in our

00:01:35.140 --> 00:01:36.940
Rechenzentrum here.

00:01:36.940 --> 00:01:40.340
We actually have an old Zuse computer.

00:01:40.340 --> 00:01:42.260
And you can go there and look.

00:01:42.260 --> 00:01:44.620
Yeah, we will go and have a look at this.

00:01:44.620 --> 00:01:45.820
Absolutely.

00:01:45.820 --> 00:01:50.620
And the very nice person who looks after that machine is very happy if we come and have

00:01:50.620 --> 00:01:52.500
a look and get a little demonstration.

00:01:52.500 --> 00:01:53.620
So this is so cool.

00:01:54.100 --> 00:01:57.020
So anyway, this is happening here as well.

00:01:57.020 --> 00:02:03.820
Now my focus in digital humanities is very much on language and data and how language

00:02:03.820 --> 00:02:09.260
can be data and what data really is, linguistic data and all of that.

00:02:09.260 --> 00:02:13.060
And that seems to be a very fashionable thing to do these days.

00:02:13.060 --> 00:02:14.060
Everything is about data.

00:02:14.060 --> 00:02:18.820
And what you then get an awful lot is people now talk about storytelling when it comes

00:02:18.820 --> 00:02:19.820
to data.

00:02:19.820 --> 00:02:22.660
And then you see there are things like this.

00:02:22.700 --> 00:02:26.820
You probably might have seen something like this where people say, OK, data science, what

00:02:26.820 --> 00:02:32.020
we do there is we collect data, we prepare data, then we visualize it, we analyze it,

00:02:32.020 --> 00:02:35.980
and then we need to tell a story with our data.

00:02:35.980 --> 00:02:37.460
I think this is a good argument.

00:02:37.460 --> 00:02:41.700
Yes, if you don't have a story, you can't explain much to people.

00:02:41.700 --> 00:02:47.980
But I think what we shouldn't forget is if we take a humanities perspective, the data

00:02:48.020 --> 00:02:53.580
we looked at already had a story before we turned it into data.

00:02:53.580 --> 00:02:58.300
And then we kind of pretend that it's just data, a lot of collection, and then afterwards

00:02:58.300 --> 00:03:02.820
we look at it and think, how am I going to tell a story with the graphs that I now have?

00:03:02.820 --> 00:03:08.540
And what I want to argue is that we need to make that connection clearer.

00:03:08.540 --> 00:03:13.900
So stories don't just turn into data and then you forget the context.

00:03:13.900 --> 00:03:16.880
We need to look at that context as well.

00:03:16.880 --> 00:03:21.560
So here are some things I just want to leave with you in the fundamental section.

00:03:21.560 --> 00:03:25.800
When we talk about writing, writing is not the same as text production.

00:03:25.800 --> 00:03:30.280
And I will say a little bit about ChatGPT obviously at the end as well.

00:03:30.280 --> 00:03:33.680
Also reading is not the same as knowledge extraction.

00:03:33.680 --> 00:03:38.820
It isn't just about can I summarize this text and know what the three key messages are.

00:03:38.820 --> 00:03:45.600
Reading is also a process that does lots of other things than coming up with a summary.

00:03:45.600 --> 00:03:49.140
There's always creativity and criticality involved.

00:03:49.140 --> 00:03:55.640
So you have the difference between product that might turn into data and process of actually

00:03:55.640 --> 00:03:59.960
engaging with it as a human being for all sorts of other benefits.

00:03:59.960 --> 00:04:06.720
So what I want to argue is we still need theoretical foundations and we also still need qualitative

00:04:06.720 --> 00:04:11.480
analysis, however many computers we have and however many programs we can apply.

00:04:11.480 --> 00:04:16.360
And what I want to talk about today is basically three principles of creativity.

00:04:16.360 --> 00:04:21.080
So that is a bit like stuff that I would do in Introduction to Digital Humanities week

00:04:21.080 --> 00:04:27.640
one and two, but very quickly today in less than half an hour hopefully.

00:04:27.640 --> 00:04:34.560
A good example for this is to take a text that everyone or a lot of people know, so

00:04:34.560 --> 00:04:39.560
you won't be surprised if you just look at this for a tiny little moment.

00:04:39.640 --> 00:04:45.400
If we look at that text, some words that are interesting to count and look at are unsurprisingly

00:04:45.400 --> 00:04:52.560
words like witch, which here from the Oxford English Dictionary when you want to look at

00:04:52.560 --> 00:04:59.600
the definition is a female magician, a sorceress, and the later use especially a woman supposed

00:04:59.600 --> 00:05:03.040
to have dealings with the devil or evil spirits.

00:05:03.040 --> 00:05:05.160
Okay, that's an interesting one.

00:05:05.160 --> 00:05:12.520
If we look at the male counterpart, wizard, we have a wise man.

00:05:12.520 --> 00:05:19.800
Slightly evaluative, little bit different in terms of, you know, no comment.

00:05:19.800 --> 00:05:27.160
But what's interesting is if we now look at this in general language and my go-to place

00:05:27.160 --> 00:05:30.120
for general language is always the British National Corpus.

00:05:30.120 --> 00:05:34.880
So this is taken from the older version but there's a newer version as well and it doesn't

00:05:34.880 --> 00:05:36.480
change much.

00:05:36.480 --> 00:05:40.440
If you look at the British National Corpus and you count all the occurrences of wizard

00:05:40.440 --> 00:05:48.280
and witch, you will find that, put together, the female form is more

00:05:48.280 --> 00:05:51.440
frequent than the male form.

00:05:51.440 --> 00:05:56.480
And that seems to have something to do with the fact that we've got this little bit of evaluative

00:05:56.480 --> 00:05:58.880
meaning going on here.
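
NOTE
The frequency comparison just described can be sketched in a few lines of Python.
This is only a minimal illustration under stated assumptions: the function names,
the word forms counted, and the toy text are hypothetical, and none of it is BNC
data or the speaker's actual procedure.
```python
import re
from collections import Counter
def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())
def form_shares(text: str, forms: dict[str, set[str]]) -> dict[str, float]:
    """Return each label's share of all hits, e.g. witch versus wizard."""
    counts = Counter(tokenize(text))
    totals = {label: sum(counts[w] for w in words) for label, words in forms.items()}
    grand_total = sum(totals.values())
    if grand_total == 0:
        return {label: 0.0 for label in forms}
    return {label: totals[label] / grand_total for label in forms}
# Toy text standing in for a corpus (assumption: not real corpus data).
sample = "The witch met a wizard. Two witches laughed, and the witch left."
shares = form_shares(sample, {"witch": {"witch", "witches"}, "wizard": {"wizard", "wizards"}})
print(shares)
```
Reporting shares rather than raw counts makes corpora of different sizes comparable.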

00:05:58.880 --> 00:06:03.920
Okay, if we then look at the corpus, this is a bit small but I'll explain what it is, that

00:06:03.960 --> 00:06:09.760
is the Oxford Children's Corpus, so children's literature published after

00:06:09.760 --> 00:06:10.760
2000.

00:06:10.760 --> 00:06:14.720
So we did some work with the University of Oxford where they've got this Oxford Corpus

00:06:14.720 --> 00:06:18.240
and it's just, you know, the stuff that you can buy in Waterstones.

00:06:18.240 --> 00:06:24.840
And if you look at that, it's also that somehow witches are better for fictional stories than

00:06:24.840 --> 00:06:26.580
wizards.

00:06:26.580 --> 00:06:31.480
And if you go further back in history, in the 19th century, even more so.

00:06:31.480 --> 00:06:33.560
Okay, so we see there's a bit of a change.

00:06:34.200 --> 00:06:38.400
You know, older stories, more witches, more recent, less so.

00:06:38.400 --> 00:06:43.880
And you can also then relate this if you just look at the pronouns in these books and see

00:06:43.880 --> 00:06:47.160
is there maybe a gendered relationship as well.

00:06:47.160 --> 00:06:52.720
And we've used the pronouns kind of as proxies for male and female characters of people.

00:06:52.720 --> 00:07:01.920
And then you see that it turns round, you know, so there you have more male people characters

00:07:02.080 --> 00:07:07.080
and fewer female people, but you have more witches than wizards.
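
NOTE
Using pronouns as a proxy for male and female characters can be sketched as below.
This is a minimal illustration, not the study's method: the pronoun lists, the
function name pronoun_shares, and the toy text are all assumptions.
```python
import re
from collections import Counter
# Assumed pronoun lists; the talk does not say which forms were counted.
FEMALE = {"she", "her", "hers", "herself"}
MALE = {"he", "him", "his", "himself"}
def pronoun_shares(text: str) -> tuple[float, float]:
    """Return (female share, male share) of all gendered pronoun tokens."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    female = sum(counts[p] for p in FEMALE)
    male = sum(counts[p] for p in MALE)
    total = female + male
    if total == 0:
        return 0.0, 0.0
    return female / total, male / total
# Toy text (assumption: invented for illustration).
sample = "She opened her book. He looked at her and put his hands in his pockets."
female_share, male_share = pronoun_shares(sample)
print(f"female {female_share:.0%}, male {male_share:.0%}")
```
A pronoun count is a rough proxy: it measures how often characters are mentioned, not how many characters there are.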

00:07:07.080 --> 00:07:09.320
Well, what does that say?

00:07:09.320 --> 00:07:13.840
Anyway, now here my first fundamental thing, three principles of creativity.

00:07:13.840 --> 00:07:17.200
The first one is the principle of minimal departure.

00:07:17.200 --> 00:07:22.360
When you deal with fiction, we assume that unless the text tells us otherwise, the fictional

00:07:22.360 --> 00:07:26.920
world is like the real world because we can't spell out everything in the fictional story.

00:07:26.920 --> 00:07:29.240
We need to make some assumptions.

00:07:29.240 --> 00:07:34.240
And what I just showed you about these pronouns, people talk about men more than they talk

00:07:34.240 --> 00:07:40.040
about women that has something to do with how we see the world.

00:07:40.040 --> 00:07:44.800
So all the biases that we have in the real world, we very easily get them in fiction

00:07:44.800 --> 00:07:45.800
as well.

00:07:45.800 --> 00:07:51.000
So the argument is that by looking at more than one fictional world, we see evidence

00:07:51.000 --> 00:07:56.240
of the real world because we just take the patterns from the real world into fiction.

00:07:57.200 --> 00:08:00.520
Why do I need all this to talk about Harry Potter?

00:08:00.520 --> 00:08:07.320
If you look at Harry Potter, suddenly wizard is more frequent than witch.

00:08:07.320 --> 00:08:09.720
Now you can think about why might this be so?

00:08:09.720 --> 00:08:12.480
Why do we now have wizards?

00:08:12.480 --> 00:08:16.280
What's happening there?

00:08:16.280 --> 00:08:17.440
Any ideas?

00:08:17.440 --> 00:08:23.020
If you look at the pronouns, it all becomes clear that they're exactly the same.

00:08:23.500 --> 00:08:27.740
The male characters are more frequent than the female characters.

00:08:27.740 --> 00:08:31.700
What it is in Harry Potter is just that you build this whole Harry Potter world of the

00:08:31.700 --> 00:08:33.340
wizards and witches.

00:08:33.340 --> 00:08:36.260
And basically that world works exactly in the same way.

00:08:36.260 --> 00:08:39.900
Obviously, the men are more frequently talked about than the women.

00:08:39.900 --> 00:08:45.660
And if you look at concordances, you see the wizard is always the person who's the greatest,

00:08:45.660 --> 00:08:46.660
the most powerful.

00:08:46.660 --> 00:08:48.740
You know, with the men, it's about that.

00:08:48.740 --> 00:08:54.060
And if we look at the witches, witches are mothers, as we would expect.

00:08:54.060 --> 00:08:59.100
And then they read this top notch literature like Witch Weekly, which is a bit like OK!

00:08:59.100 --> 00:09:00.820
magazine or something like that.

00:09:00.820 --> 00:09:03.020
You know, so no stereotypes spared.

00:09:03.020 --> 00:09:04.020
Yeah.

00:09:04.020 --> 00:09:08.020
But just showing you this for these principles.

00:09:08.020 --> 00:09:14.260
Now what I'm arguing is that you can also still change the world with fiction.

00:09:14.260 --> 00:09:15.660
We don't need to accept this.

00:09:15.660 --> 00:09:18.300
It depends on what we do with this.

00:09:18.460 --> 00:09:24.180
These gendered patterns are patterns that are negotiated in the language.

00:09:24.180 --> 00:09:25.180
OK?

00:09:25.180 --> 00:09:30.260
Patterns have to do with evaluative meaning and bias.

00:09:30.260 --> 00:09:33.340
And patterns occur in fiction.

00:09:33.340 --> 00:09:38.740
Fiction is the place where we negotiate meanings because we can think about things that could

00:09:38.740 --> 00:09:41.620
hypothetically take place.

00:09:41.620 --> 00:09:46.740
If you look at children's literature, there's a lot about how you should represent more

00:09:46.780 --> 00:09:50.580
diverse families, how you deal with people from different backgrounds.

00:09:50.580 --> 00:09:52.020
There's a lot of thinking going on.

00:09:52.020 --> 00:09:56.420
And we looked with some colleagues at these books here.

00:09:56.420 --> 00:09:57.460
Does anyone know these?

00:09:57.460 --> 00:10:01.580
Murder Most Unladylike in the UK is the absolute craze at the moment.

00:10:01.580 --> 00:10:04.500
That is Robin Stevens and everyone.

00:10:04.500 --> 00:10:09.380
You know, if you've got kids in the UK, you must have these books.

00:10:09.380 --> 00:10:13.020
So what it is, it's kind of obviously boarding school story.

00:10:13.020 --> 00:10:16.460
But this time it's two girls who are detectives.

00:10:16.500 --> 00:10:25.100
And the whole book and the whole series is really about these girls behaving in a very

00:10:25.100 --> 00:10:29.420
tough way, solving these murder mysteries.

00:10:29.420 --> 00:10:34.140
If you look at the Murder Most Unladylike, these are the figures for the pronouns again.

00:10:34.140 --> 00:10:39.900
There, the world is actually changed and turned into something else because now you have more

00:10:39.900 --> 00:10:40.900
female characters.

00:10:40.900 --> 00:10:44.820
So the female pronouns, 68% versus the male pronouns.

00:10:44.820 --> 00:10:49.140
So here you have a really female world.

00:10:49.140 --> 00:10:56.500
And then you have texts, text sections that talk about these characters in a way that

00:10:56.500 --> 00:11:04.220
really stress these gendered situations, stereotypes, things that want to be changed.

00:11:04.220 --> 00:11:09.500
So here's a typical example because we looked at the word unladylike because that's part

00:11:09.500 --> 00:11:13.500
of the title and we thought, why is this an interesting word to look at?

00:11:13.740 --> 00:11:20.580
And if you look at unladylike in general language, again, you get so much, you learn

00:11:20.580 --> 00:11:28.060
so much about stereotypes and how people think women should behave, sit, walk, think, dress.

00:11:28.060 --> 00:11:29.060
It is amazing.

00:11:29.060 --> 00:11:31.980
I mean, if you look at all this, you don't want to get up in the morning anymore because

00:11:31.980 --> 00:11:33.620
it's just so complex.

00:11:33.620 --> 00:11:38.860
But here in the book, you then have things where the book always makes a point of they

00:11:38.860 --> 00:11:43.100
do something and that is decidedly unladylike.

00:11:43.100 --> 00:11:44.860
And that is positive.

00:11:44.860 --> 00:11:48.180
So doing stuff in an unladylike way is good.

00:11:48.180 --> 00:11:50.220
So here we've got something.

00:11:50.220 --> 00:11:53.620
They just have a really great insight into who it might be.

00:11:53.620 --> 00:11:59.740
And here you see they also said what she would have called Sherlock-y, so from Sherlock Holmes.

00:11:59.740 --> 00:12:04.660
So they always use things that are typically male to describe what great achievements they

00:12:04.660 --> 00:12:05.660
come up with.

00:12:05.660 --> 00:12:09.620
And then at some point she said Daisy said something extremely unladylike.

00:12:09.860 --> 00:12:16.100
Obviously the book is PC, so you don't hear what she said, but it was extremely unladylike.

00:12:16.100 --> 00:12:19.060
So you can imagine which words this might have been.

00:12:19.060 --> 00:12:21.520
So the unladylike is a point we want to make.

00:12:21.520 --> 00:12:28.420
And that is the second principle of creativity that sometimes in the text, you highlight

00:12:28.420 --> 00:12:33.100
parts of the text that receive more emphasis than others.

00:12:33.100 --> 00:12:37.980
And this textual highlighting creates some psychological foregrounding.

00:12:37.980 --> 00:12:43.620
So this is a story where really the gender topic, you can't easily escape this.

00:12:43.620 --> 00:12:47.740
And you have these little parts in the text where something is really highlighted for

00:12:47.740 --> 00:12:49.220
you to not miss it.

00:12:49.220 --> 00:12:55.500
Anyway, the next principle I want to show you is something I need to prepare a bit.

00:12:55.500 --> 00:13:01.300
And that is stories are generally about people.

00:13:01.300 --> 00:13:05.940
We have, again, in children's books, we have stories where animals are the protagonists,

00:13:05.940 --> 00:13:09.340
but then, funnily enough, these animals always behave like people.

00:13:09.340 --> 00:13:12.500
So really stories are about people.

00:13:12.500 --> 00:13:16.740
And there are typical ways of talking about people and describing people.

00:13:16.740 --> 00:13:20.760
And body language is a very common ingredient of fiction.

00:13:20.760 --> 00:13:26.460
So now I want to say something about patterns that are common and typical and repeated.

00:13:26.460 --> 00:13:30.140
And people who know me will know there's no way of saying this without talking about CLiC;

00:13:31.140 --> 00:13:36.540
that is the web app that we've developed, especially for the study of fiction.

00:13:36.540 --> 00:13:37.900
And I won't go into the details.

00:13:37.900 --> 00:13:41.700
It has lots of functionalities, but we don't need the functionalities here.

00:13:41.700 --> 00:13:46.700
I just wanted to see if you've got a mobile device that is connected to the internet,

00:13:46.700 --> 00:13:49.880
we could just together quickly look at an example.

00:13:49.880 --> 00:13:57.940
If you go to clic.bham.ac.uk, our mobile-friendly version, if there is enough Wi-Fi to actually

00:13:57.940 --> 00:14:06.420
do this, if you go to clic.bham.ac.uk, you should see something like this.

00:14:06.420 --> 00:14:07.860
Is that something you can see?

00:14:07.860 --> 00:14:08.860
If anyone can see?

00:14:08.860 --> 00:14:10.740
I see Anastasia is nodding.

00:14:10.740 --> 00:14:12.980
That is good.

00:14:12.980 --> 00:14:14.980
People in the back, anyone?

00:14:14.980 --> 00:14:15.980
Yeah, yeah.

00:14:15.980 --> 00:14:19.700
People in front are nodding because then if you click on this little box up here, you

00:14:19.700 --> 00:14:21.260
then get these options.

00:14:21.260 --> 00:14:22.260
Yeah.

00:14:22.260 --> 00:14:25.140
The little, yeah, that little box.

00:14:25.820 --> 00:14:26.820
Yeah.

00:14:26.820 --> 00:14:30.380
This little box, yeah.

00:14:30.380 --> 00:14:34.940
You then get options and of these options, I want you to click on concordance.

00:14:34.940 --> 00:14:39.860
And then in the concordance, please go to the search, the corpora, then you get a little

00:14:39.860 --> 00:14:46.660
drop down and you just tick Dickens' novels.

00:14:46.660 --> 00:14:53.060
And then you put in for the search term, the word hands.

00:14:53.060 --> 00:14:55.500
And then it depends on how your phone works.

00:14:55.500 --> 00:14:57.820
For some phones, you just have to click refresh.

00:14:57.820 --> 00:15:03.460
For others, you have to do other things, but you know your phone, how you make it do.

00:15:03.460 --> 00:15:07.900
Can you see it, love?

00:15:07.900 --> 00:15:12.580
And if it doesn't, just talk to the person sitting next to you if you're unsure.

00:15:12.580 --> 00:15:15.380
Does it work?

00:15:15.380 --> 00:15:17.140
I see some people smiling.

00:15:17.140 --> 00:15:20.260
Is that desperation or does it mean it works?

00:15:20.260 --> 00:15:21.260
It's okay.

00:15:21.260 --> 00:15:22.260
Okay.

00:15:22.260 --> 00:15:25.180
So you can see then something like this.

00:15:25.180 --> 00:15:26.180
Yeah.

00:15:26.180 --> 00:15:31.620
And that is what we call a concordance, similar to what I showed you for wizard and

00:15:31.620 --> 00:15:32.780
witch.

00:15:32.780 --> 00:15:39.020
And this is just a sequence of occurrences of the word hands in Charles Dickens.

00:15:39.020 --> 00:15:41.220
And these occur in the order of the book.
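
NOTE
A concordance of this kind (each hit shown with its left and right context, in
text order) can be sketched as follows. This is a minimal illustration and not
how the web app itself is implemented: the function name concordance, the width
parameter, and the toy sentences are assumptions.
```python
import re
def concordance(text: str, node: str, width: int = 4) -> list[tuple[str, str, str]]:
    """Return (left context, node, right context) for each hit, in text order."""
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == node.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines
# Toy sentences (assumption: invented, not quotations from Dickens).
sample = "He put his hands in his pockets. She clasped her hands and waited."
hits = concordance(sample, "hands")
for left, node, right in hits:
    print(f"{left:>25} | {node} | {right}")
```
Aligning the search word in a central column is what makes repeated patterns to its left and right easy to spot.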

00:15:41.220 --> 00:15:47.340
And then we have some other fancy functions where you can, for example, KWIC group and

00:15:47.820 --> 00:15:54.900
say I want to see which possessives are in front of or are occurring after the hands

00:15:54.900 --> 00:15:56.140
or something like this.

00:15:56.140 --> 00:15:58.220
And then you can see patterns.

00:15:58.220 --> 00:16:05.380
And the reason why I've used the word hands is there are patterns that occur really repeatedly,

00:16:05.380 --> 00:16:12.140
not just in one Dickens novel, not just in Charles Dickens, but across fiction.

00:16:12.140 --> 00:16:16.420
And a typical pattern is putting his hands in his pockets.

00:16:16.940 --> 00:16:19.980
So putting his hands in his pockets is like something like this.

00:16:19.980 --> 00:16:25.580
And that is typical male behavior in a lot of fictional texts.

00:16:25.580 --> 00:16:32.800
And what I wanted to illustrate with this little example is the third principle of creativity.

00:16:32.800 --> 00:16:37.020
So creativity needs convention as well.

00:16:37.020 --> 00:16:44.300
So because the language is so patterned, both on the lexicogrammatical level and larger levels,

00:16:44.300 --> 00:16:50.140
we need texts to contextualize with other texts.

00:16:50.140 --> 00:16:56.700
So we need to spell out what we've already read or heard somewhere else.

00:16:56.700 --> 00:17:04.260
Because otherwise, your text will sound extremely weird if you don't make these contextualizing

00:17:04.260 --> 00:17:07.340
patterns in fiction that you write.

00:17:07.340 --> 00:17:08.420
Okay?

00:17:08.420 --> 00:17:13.940
So I hope this is okay for the principles, because now I'll show you a few more examples.

00:17:14.340 --> 00:17:20.460
Going back to some of the stuff we have done in children's fiction, where we looked at

00:17:20.460 --> 00:17:25.580
that corpus, the one on the left is the ChiLit corpus, so that's a 19th century corpus of children's

00:17:25.580 --> 00:17:26.580
fiction.

00:17:26.580 --> 00:17:31.020
So this is the classics, you know, Beatrix Potter, Alice in Wonderland, the stuff that

00:17:31.020 --> 00:17:33.900
we all know and somehow have grown up with.

00:17:33.900 --> 00:17:39.400
And the other one is the OCC, that is the 2000 onwards Oxford Children's Corpus.

00:17:39.400 --> 00:17:42.860
And for me, this was very interesting, because that's the stuff I've read with my son.

00:17:42.860 --> 00:17:47.860
So the one is the one that I grew up with somehow, because it was the classics then,

00:17:47.860 --> 00:17:52.300
and the other one is the stuff that I've been reading now as a parent, which is very interesting

00:17:52.300 --> 00:17:53.300
in itself.

00:17:53.300 --> 00:17:59.300
So what we then did is we said we wanted to find common patterns, repeated patterns, and

00:17:59.300 --> 00:18:01.840
especially the body language common patterns.

00:18:01.840 --> 00:18:06.300
So we did something that was super easy, and I'm giving this example because you will know

00:18:06.300 --> 00:18:07.300
all this.

00:18:07.300 --> 00:18:12.180
You know, Google Ngram Viewer is something that was hyped when it came out, and I think

00:18:12.500 --> 00:18:17.020
since then everyone really knows what n-grams are, so repeated sequences of words.

00:18:17.020 --> 00:18:23.140
And we wanted to just look at all of them that had possessives in them, so things like

00:18:23.140 --> 00:18:28.060
his hands in his pockets, his face with his hands, covered his face with his.

00:18:28.060 --> 00:18:33.140
So we really just checked what sequences of five words can we find, where we've got body

00:18:33.140 --> 00:18:39.980
part nouns and possessives, so that we can do a little comparison as to female characters

00:18:39.980 --> 00:18:40.980
and male characters.
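
NOTE
The search for five-word sequences containing a body part noun and a possessive
can be sketched like this. It is only a minimal illustration under stated
assumptions: the word lists, the function name body_language_ngrams, and the toy
text are hypothetical, and the lists used in the actual study are not given here.
```python
import re
from collections import Counter
# Assumed word lists for illustration only.
BODY_PARTS = {"hand", "hands", "face", "eyes", "arms", "neck", "hips", "head"}
POSSESSIVES = {"his", "her"}
def body_language_ngrams(text: str, n: int = 5) -> Counter:
    """Count n-word sequences that contain a body part noun and a possessive."""
    tokens = re.findall(r"[a-z']+", text.lower())
    grams = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tokens[i:i + n]
        has_body_part = any(t in BODY_PARTS for t in gram)
        has_possessive = any(t in POSSESSIVES for t in gram)
        if has_body_part and has_possessive:
            grams[" ".join(gram)] += 1
    return grams
# Toy text (assumption: invented for illustration).
sample = "He stood with his hands in his pockets. Later he walked off, his hands in his pockets again."
grams = body_language_ngrams(sample)
print(grams.most_common(3))
```
Splitting the results by his and her would then give the comparison between male and female characters.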

00:18:40.980 --> 00:18:41.980
One second.

00:18:42.780 --> 00:18:50.180
So we wanted to compare female and male characters, and we also wanted to compare how things

00:18:50.180 --> 00:18:52.540
change over time.

00:18:52.540 --> 00:18:59.020
So what we found is basically in the 19th century fiction, and I can't show you all

00:18:59.020 --> 00:19:06.320
of this, it's more that female characters are very much defined in relation to male

00:19:06.320 --> 00:19:07.620
characters.

00:19:08.260 --> 00:19:15.780
So they throw their arms around other people's neck, or they do things where they grab other

00:19:15.780 --> 00:19:17.620
people or they look at them.

00:19:17.620 --> 00:19:24.300
It's really more in relation to, and otherwise you have things like with tears in her eyes

00:19:24.300 --> 00:19:27.140
or covered her face with her hands or something.

00:19:27.140 --> 00:19:28.740
You see where this is going.

00:19:28.740 --> 00:19:33.660
Whereas men are very much like his hands in his pockets and doing things that are really

00:19:33.660 --> 00:19:37.140
like, you know, everything's possible.

00:19:37.660 --> 00:19:45.300
If you then look at the more contemporary, you see that this kind of changes, and the

00:19:45.300 --> 00:19:49.460
patterns we found, and this is important, we didn't just look at these sequences.

00:19:49.460 --> 00:19:56.100
We also looked at a lot of the context, because context explains the meaning of the pattern.

00:19:56.100 --> 00:20:02.460
And you could then see that there was behavior of female characters, so the top frequent

00:20:02.460 --> 00:20:06.540
one is something where female characters stand like this.

00:20:06.940 --> 00:20:12.020
And then the context describes that these female characters are very self-confident,

00:20:12.020 --> 00:20:15.140
and they're really having a go at the person they are talking to.

00:20:15.140 --> 00:20:19.740
So it's not like they put their face behind their hands and don't want to speak to them.

00:20:19.740 --> 00:20:23.180
So it's kind of, you can see how this is a bit different.

00:20:23.180 --> 00:20:29.400
Then you all know this, then there was a time in psychology where people then started talking

00:20:29.400 --> 00:20:33.340
about the Wonder Woman pose and something as a power pose.

00:20:33.340 --> 00:20:37.700
I mean, this is now old news, I don't think we do this as much anymore, but there are

00:20:37.700 --> 00:20:41.140
different power poses for men and for women.

00:20:41.140 --> 00:20:45.620
So if you look at something like this, James Bond, he wouldn't stand like this.

00:20:45.620 --> 00:20:48.160
Imagine Daniel Craig like, seriously, no.

00:20:48.160 --> 00:20:54.260
But then you also have later Wonder Woman, which is a bit different from early Wonder

00:20:54.260 --> 00:20:55.260
Woman.

00:20:55.260 --> 00:20:57.420
So there's a lot of interesting stuff going on there.

00:20:57.420 --> 00:21:01.740
What I like most is that people then also talk about real world situations.

00:21:02.140 --> 00:21:04.420
I don't know whether you remember Theresa May.

00:21:04.420 --> 00:21:09.420
She was one of the many prime ministers we lived through in the UK, and at some point

00:21:09.420 --> 00:21:15.540
she really wanted to present herself as this, you know, I've now got the house in order.

00:21:15.540 --> 00:21:19.780
And then obviously the press, as the press would do, mocked this immediately.

00:21:19.780 --> 00:21:24.740
Oh, Theresa May appeared to go for a spot of power posing here, and then obviously with

00:21:24.740 --> 00:21:26.780
the comment, and it didn't really help.

00:21:26.780 --> 00:21:31.500
And then you had people like here, David Cameron, that's a really good one.

00:21:31.500 --> 00:21:37.180
It says that then Prime Minister David Cameron went full East End gang boss at the photo

00:21:37.180 --> 00:21:38.180
shoot.

00:21:38.180 --> 00:21:45.020
So I don't know if, this is now a bit of a UK joke, but Gavin, you probably might have

00:21:45.020 --> 00:21:46.660
watched EastEnders at some point.

00:21:46.660 --> 00:21:53.300
I definitely have, and EastEnders is really, you know, when you see him there, it almost

00:21:53.300 --> 00:21:56.820
looks like, you know, Phil or Grant Mitchell or something like this.

00:21:56.820 --> 00:21:59.220
You just cannot unsee it somehow.

00:21:59.220 --> 00:22:02.380
But the press is very good at making fun of these things.

00:22:02.380 --> 00:22:07.260
Anyway, what I want to say is that there's quite a lot of connection between what you

00:22:07.260 --> 00:22:12.180
have in fiction and what you have in the real world, either because fiction shows you what

00:22:12.180 --> 00:22:16.740
happens in the real world, or the real world might then make fun of what you have in fiction

00:22:16.740 --> 00:22:18.140
to do other things.

00:22:18.140 --> 00:22:21.700
And this is one I can't talk about, I'll just give this to you in case you want to look

00:22:21.700 --> 00:22:22.700
at this.

00:22:22.700 --> 00:22:27.140
There was a really nice article in The Conversation where they talked about spies are not who

00:22:27.140 --> 00:22:28.940
you think they are.

00:22:28.940 --> 00:22:32.580
And that is interesting in the sense, what do we know about spies?

00:22:32.580 --> 00:22:39.060
A lot of what we think spies are is because we all, many of us, do watch James Bond movies

00:22:39.060 --> 00:22:40.500
and enjoy this very much.

00:22:40.500 --> 00:22:44.660
So we learn about the world also from fiction.

00:22:44.660 --> 00:22:49.460
So it's not just that the world is reflected in fiction, but we learn about stuff that

00:22:49.460 --> 00:22:52.980
we don't have direct access to through fiction.

00:22:53.980 --> 00:22:59.980
Now I just want to use the last couple of minutes to talk a little bit about how we

00:22:59.980 --> 00:23:07.100
get from these principles that help us to describe what we find in our data.

00:23:07.100 --> 00:23:13.540
So the principle of minimal departure, highlighting and contextualizing, these are good principles

00:23:13.540 --> 00:23:16.940
to analyze data and texts.

00:23:16.940 --> 00:23:22.260
And what I'm currently very interested in is how do we get from the understanding of

00:23:22.540 --> 00:23:26.980
text and the analysis of text to the actual production of text?

00:23:26.980 --> 00:23:34.940
And how can the patterns that we have identified help us learn to write better or become a

00:23:34.940 --> 00:23:37.140
novelist should we want to do this?

00:23:37.140 --> 00:23:40.620
And that's a little project that I've also been doing with other colleagues that we've

00:23:40.620 --> 00:23:42.780
called then CLiC Creative.

00:23:42.780 --> 00:23:47.260
And you see we've already changed the font size to make it look more creative, but yeah.

00:23:48.260 --> 00:23:53.540
What we're doing there is we're developing training materials where we use these insights

00:23:53.540 --> 00:23:59.100
like things like his hands in his pockets or her hands on her hips to actually then

00:23:59.100 --> 00:24:04.780
prepare materials and especially for children who learn how to write stories to say these

00:24:04.780 --> 00:24:08.940
are things maybe you should be aware of or give them little examples and say if someone

00:24:08.940 --> 00:24:11.180
stands like this, what does this look like?

00:24:11.180 --> 00:24:12.180
What do you think?

00:24:12.180 --> 00:24:16.060
What emotion is meant to be expressed in this way?

00:24:16.860 --> 00:24:23.060
We then use our concordances as a way of zooming in, so kind of doing a bit of focused reading.

00:24:23.060 --> 00:24:27.620
So saying I write a story where I want to write descriptions of eyes or so on.

00:24:27.620 --> 00:24:31.940
And it's very helpful to run a concordance on eyes and see what actually happens to then

00:24:31.940 --> 00:24:35.420
get a sense of how could we maybe do this?

00:24:35.420 --> 00:24:38.100
We then also give them little examples.

00:24:38.100 --> 00:24:42.740
The important thing for me to mention is it is a portfolio approach.

00:24:42.740 --> 00:24:49.620
So we are not trying to develop something like these AI writing tools that basically

00:24:49.620 --> 00:24:55.300
do the job for you, but it's more like you need a set of different things, examples,

00:24:55.300 --> 00:24:59.620
skills, practices, and this is what we're trying to put into our portfolio.

00:24:59.620 --> 00:25:05.540
If you have time, we've also done some little videos that you can watch and this is the

00:25:05.540 --> 00:25:07.660
kind of handouts we prepare.

00:25:07.660 --> 00:25:14.300
This was an interesting one where we looked at how do we actually talk about light in

00:25:14.300 --> 00:25:15.580
fiction?

00:25:15.580 --> 00:25:20.540
Do we describe light in terms of what we see?

00:25:20.540 --> 00:25:24.600
So do we think light is a very visual thing?

00:25:24.600 --> 00:25:29.580
But actually light is something that you also hear and describe.

00:25:29.580 --> 00:25:35.380
A lot of the importance of light in fiction is the audio of it, kind of, you know, like

00:25:35.860 --> 00:25:42.020
lighting a match, hearing the fire in the background, having an old oil lamp where the

00:25:42.020 --> 00:25:45.460
sound is somehow different from electric light.

00:25:45.460 --> 00:25:50.620
So these are things that we want to use to help teach creative writing and then we do

00:25:50.620 --> 00:25:55.180
stuff like this where we actually speak to authors who do write fiction.

00:25:55.180 --> 00:25:59.580
So we've done an interview with Essie Fox who's just done a really brilliant book, if

00:25:59.580 --> 00:26:03.900
you're into like gothic stuff, that kind of thing, and then talk about authors, how do

00:26:03.980 --> 00:26:08.140
you actually learn and do these things?

00:26:08.140 --> 00:26:13.660
Now I want to show you, I do have two more minutes, yeah, I want to show you just one

00:26:13.660 --> 00:26:19.540
example of the stuff that we are currently trying to work on in this whole CLiC Creative

00:26:19.540 --> 00:26:26.540
because you can't help doing stuff, so this is just the handouts quickly, I want to get

00:26:26.540 --> 00:26:29.740
to the next thing, so that's all the stuff that you can download.

00:26:29.740 --> 00:26:34.540
But you can't do all this without actually looking at ChatGPT.

00:26:34.540 --> 00:26:39.260
So we're also trying to see some of these things that we now know because we looked

00:26:39.260 --> 00:26:45.800
at the patterns, might these things help us to write better prompts for things like whether

00:26:45.800 --> 00:26:52.800
that is ChatGPT or any of these other tools and how would you need to do the prompt writing

00:26:52.820 --> 00:26:56.620
so that you actually get what you want to get.

00:26:56.860 --> 00:27:00.420
Some of this is really very strange because in a sense you know exactly what you want

00:27:00.420 --> 00:27:05.340
to get and then you spend all your time trying to explain this to the thing to see whether

00:27:05.340 --> 00:27:09.500
it can actually produce it, where you think, if I'd just written this myself.

00:27:09.500 --> 00:27:15.740
But obviously we need to understand how these prompts can help us because then we can generalise

00:27:15.740 --> 00:27:17.660
from these insights.

00:27:17.660 --> 00:27:21.740
And we don't have enough time so that I could show you the longer narrative stuff, so I

00:27:21.820 --> 00:27:25.700
just show you a very small little example.

00:27:25.700 --> 00:27:32.700
And that is one from asking the thing to write a little poem or the beginning of a little

00:27:32.780 --> 00:27:32.980
poem.

00:27:32.980 --> 00:27:38.100
And that takes us back to the witches from the beginning to think about okay what do

00:27:38.100 --> 00:27:45.100
we get if we want a little poem about a witch and we want the poem to start by describing

00:27:45.720 --> 00:27:47.620
what the witch looks like.

00:27:47.860 --> 00:27:54.860
And also it's a poem for children and ideally written in rhyme and we don't want more than

00:27:55.420 --> 00:27:58.700
two stanzas for now because we just want to see.

00:27:58.700 --> 00:28:02.220
Now have a little read.

00:28:02.220 --> 00:28:05.460
James can you just read it for us?

00:28:05.460 --> 00:28:10.460
In a forest deep where the tall trees sway, lives a witch with a smile that sparkles like

00:28:10.460 --> 00:28:17.460
day, with hair like spun gold and eyes bright and blue, in lavender robes she's both friendly

00:28:17.620 --> 00:28:18.980
and true.

00:28:18.980 --> 00:28:25.980
Her broomsticks are danced with a sweep and a twirl, she gathers her dreams in a bright

00:28:25.980 --> 00:28:31.980
little swirl, with potions of wonder and tales yet untold, she whispers to stars as the night

00:28:31.980 --> 00:28:34.180
turns to gold.

00:28:34.180 --> 00:28:39.180
On a scale of one to five who thinks this is a five star poem?

00:28:39.180 --> 00:28:41.460
Hands up.

00:28:41.460 --> 00:28:43.580
You think it's a five star poem?

00:28:43.580 --> 00:28:46.180
Who thinks it's a one star poem?

00:28:47.180 --> 00:28:51.180
No one? Who thinks it's maybe three or something like this?

00:28:51.180 --> 00:28:55.180
Yeah, that's kind of a three star.

00:28:55.180 --> 00:29:00.180
So if I did a creative writing class I would probably say good start but maybe that's some

00:29:00.180 --> 00:29:02.180
things we've got to work on.

00:29:02.180 --> 00:29:07.180
And it's also funny, I mean again I don't know how much UK TV you guys watch but it sounds

00:29:07.180 --> 00:29:11.180
to me very much like the Aldi Christmas ad.

00:29:11.180 --> 00:29:19.180
No, because they always do this like yeah then Kevin the Carrot and Blahdy Blah goes

00:29:19.180 --> 00:29:20.180
somewhere and does it.

00:29:20.180 --> 00:29:23.180
You know that is the kind of like easy something something.

00:29:23.180 --> 00:29:30.180
And what I was thinking when I tried to give the thing the prompt was really this one.

00:29:30.180 --> 00:29:31.180
Do you know that one?

00:29:31.180 --> 00:29:33.180
James do you want to read that as well?

00:29:34.180 --> 00:29:40.180
The witch had a cat and a hat that was black and long ginger hair and a braid down her back.

00:29:40.180 --> 00:29:46.180
This is a poem for children and that gets really exciting and interesting and some of

00:29:46.180 --> 00:29:50.180
the really exciting and interesting stuff that happens there is actually the cat.

00:29:50.180 --> 00:29:52.180
So the witch had a cat.

00:29:52.180 --> 00:29:56.180
So it's not just about the hat that was black but it is about the cat.

00:29:56.180 --> 00:29:59.180
Everything else is just there to rhyme with cat really.

00:29:59.180 --> 00:30:04.180
But you see how that is already a bit of a different opening from the other ones where

00:30:04.180 --> 00:30:08.180
you get everything that we've always heard about witches put together in as many words

00:30:08.180 --> 00:30:09.180
as we need.

00:30:09.180 --> 00:30:15.180
And here Julia Donaldson who is absolutely brilliant for children's literature you see

00:30:15.180 --> 00:30:21.180
how she sets up the very beginning of the poem in the first line by putting in the cat.

00:30:21.180 --> 00:30:25.180
And that is one thing that so that is part of the stuff I'm trying to do at the moment

00:30:25.180 --> 00:30:29.180
is trying to think about narrative in terms of also the sequencing.

00:30:29.180 --> 00:30:34.180
It isn't just about putting patterns together because they have been set somehow.

00:30:34.180 --> 00:30:40.180
But you need this connection of you're actually wanting to arrive somewhere so it's good and

00:30:40.180 --> 00:30:44.180
well if stuff like, Gavin you mentioned this, Sudowrite writes a little chapter for you.

00:30:44.180 --> 00:30:49.180
But if you don't know where you want to end with this it's neither here nor there.

00:30:49.180 --> 00:30:52.180
There's a lot of editing and doing stuff like this.

00:30:52.180 --> 00:30:54.180
Anyway this I just wanted to give you something.

00:30:54.180 --> 00:30:58.180
This is work in progress but you see what's currently happening so you might want to come

00:30:58.180 --> 00:30:59.180
back to this.

00:30:59.180 --> 00:31:03.180
And if you have ideas and if anyone is interested in getting into any of this just let me know.

00:31:03.180 --> 00:31:05.180
So some conclusions quickly.

00:31:06.180 --> 00:31:11.180
What I hope I have shown is that language is not the same as data.

00:31:11.180 --> 00:31:15.180
So rather than putting the bricks together, having the house and telling the story after,

00:31:15.180 --> 00:31:20.180
I would very much argue and say let's not forget the stories while we're actually looking

00:31:20.180 --> 00:31:21.180
at it.

00:31:21.180 --> 00:31:26.180
Also language is where society negotiates its value system.

00:31:26.180 --> 00:31:33.180
The fact that we have evaluation biases is something that is a snapshot at any one point

00:31:33.180 --> 00:31:38.180
in time but it doesn't have to stay like this and fiction is a place to help change this.

00:31:38.180 --> 00:31:45.180
Language really appears in the form of stories, and stories and narratives, they can be fictional

00:31:45.180 --> 00:31:46.180
or non-fictional.

00:31:46.180 --> 00:31:51.180
So my James Bond example and spies and all the rest of it and East End gang bosses.

00:31:51.180 --> 00:31:59.180
These boundaries are very fuzzy and therefore fiction also can change the world I strongly

00:31:59.180 --> 00:32:00.180
believe.

00:32:00.180 --> 00:32:07.180
For me as a corpus linguist I think we can provide theories and background stuff that

00:32:07.180 --> 00:32:12.180
we need to make sense in this age of AI that we're currently in.

00:32:12.180 --> 00:32:17.180
This was the last bit: we need to think about the patterns of stuff we learn and what that

00:32:17.180 --> 00:32:23.180
can do in the world where ChatGPT could take over a lot of the writing and the production.

00:32:23.180 --> 00:32:26.180
But here the question is why do we do this?

00:32:26.180 --> 00:32:28.180
Language always has a purpose.

00:32:29.180 --> 00:32:34.180
Do we write because we need a lot of text so that we have an awful lot of data to analyze

00:32:34.180 --> 00:32:38.180
or do we write because there's a purpose for that text?

00:32:38.180 --> 00:32:44.180
And that is something I want you to think about and there will be more on the things

00:32:44.180 --> 00:32:47.180
I've just mentioned somewhere here at some point.

00:32:47.180 --> 00:32:49.180
Okay and this is where I want to finish.

