Thanks for testing and for your comment, @oddrun !
Let me look into this!
Fine! I suspect this bug is a result of the bibliographic data processing, though, not the UI.
In my opinion, we must distinguish between which characters of a title are used in sorting/filing, and the actual title itself. For example, a title field like
240 #3 $aEt dukkehjem
doesn’t mean that the first 3 characters of the title should be removed, only that the first 3 characters should be ignored in sorting procedures, i.e. when finding its place in an alphabetical index or filing system. There, this title should appear in the ‘D’ part.
However, the (preferred) title is still Et dukkehjem, and should be rendered as such. I for one think it is very important to show the preferred titles of original works correctly.
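To make the distinction concrete, here is a minimal Python sketch. The function name `filing_key` and the sample records are made up for illustration and say nothing about how the actual data processing is implemented. The nonfiling count only affects the sort key; the title that gets displayed is left untouched.

```python
def filing_key(title: str, nonfiling: int) -> str:
    """Sort key: skip the first `nonfiling` characters, then case-fold."""
    return title[nonfiling:].casefold()


# (title as it should be displayed, nonfiling character count)
records = [
    ("Gengangere", 0),
    ("Et dukkehjem", 3),
    ("Brand", 0),
]

# Sort on the filing key, but keep the full title for rendering.
ordered = sorted(records, key=lambda r: filing_key(r[0], r[1]))
for title, _skip in ordered:
    print(title)
```

Here "Et dukkehjem" files under D, between "Brand" and "Gengangere", yet it is still printed with its article.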
Dear Oddrun, you’re right: the skip is used for the sorting of the titles.
Let me try to explain our decision to apply this rule more clearly.
We process millions of records (bibliographic and authority) that come from different libraries, organizations and so on. Each of them applies different cataloging practices.
As I already mentioned, the title of the opus comes from tags 130/240/245/730/830/740, but also from $t of tags 1XX/7XX. So, we can have:
• Bibliographic records where the title is in:
1. 240 00$aEt dukkehjem
2. 240 03$aEt dukkehjem
3. 240 10$aDukkehjem
4. 700 1 $aIbsen, Henrik,$d1828-1906$tDukkehjem
• Authority records where the title is in:
5. 130 #0 $aDukkehjem
6. 130 #0 $aEt Dukkehjem
7. 130 #3 $aEt Dukkehjem
8. 100 1# ‡aIbsen, Henrik ‡d (1828-1906).‡tDukkehjem
Analyzing a large number of records, we noticed that in most of them the article is either not reflected in the corresponding skip indicator (see cases 1 and 6) or not included in the title at all (see cases 3 and 5).
Moreover, many opus titles come from $t in NT tags (1XX, 7XX), so in order to reconcile all entities, we decided to cluster the titles without the article.
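A minimal Python sketch of this kind of reconciliation follows. It is an illustration only, not our production rules: the function `cluster_key` and the short `ARTICLES` set are assumptions made for the example. The key is used only for grouping variants; which form is displayed is a separate decision.

```python
from collections import defaultdict

# Illustrative only: a few Norwegian articles, not a complete stop word list.
ARTICLES = {"en", "ei", "et", "ein", "eit", "den", "det", "de"}


def cluster_key(title: str, nonfiling: int = 0) -> str:
    """Normalize a title for grouping: apply the skip, then drop a leading article."""
    words = title[nonfiling:].casefold().split()
    # Only strip an article if something is left afterwards.
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


# (title as found in the record, nonfiling indicator value)
variants = [
    ("Et dukkehjem", 0),   # article present, indicator not set
    ("Et dukkehjem", 3),   # article present, indicator set
    ("Dukkehjem", 0),      # article missing from the title
    ("Et Dukkehjem", 0),   # authority form with different casing
]

clusters = defaultdict(list)
for title, skip in variants:
    clusters[cluster_key(title, skip)].append(title)

for key, titles in clusters.items():
    print(key, "<-", titles)
```

All four variants end up under the single key `dukkehjem`. A word-list approach like this can also strip words that are not articles in context, which is one reason a better stop word list matters.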
I see (I think). So, if every reference (in all the incoming records) to this title had been “Et dukkehjem”, then you would have kept that title in svde also - including the article? Does this mean that it’s not the eventual number in indicator 2 that causes the exclusion of articles, but the fact that some of the incoming data encode titles without a starting article?
I think this is not quite satisfactory in the long run, and it will cause a lot of J.Cricketing… Giving precedence to the records from the country of the author (or the country of first publication) as source of title could solve some of it.
I opened a development ticket to find a solution to this request. In the meantime, could you please send a list of Norwegian stop words that we can use to improve our rules and algorithm?
Hi, here is a stopword list for Norwegian (bokmål and nynorsk): Stopwords in Norwegian (by Snowball)
Thanks very much, @oddrun.