Titles of original works starting with an article are left-stripped (#4549)

Example:

  1. Advanced search for person=Henrik Ibsen → 1 entry in result list, good!
  2. Click on the entry to show original works → Most of his original works seem to be listed (plus some aggregations, but that’s another matter), which is also very good.
    However, initial articles are stripped from the titles:
    Et dukkehjem is shown as Dukkehjem
    De unges forbund is shown as Unges forbund
    The same bug appeared in v. 1.0, and I wonder if the source of the problem is the number of nonfiling characters encoded in the second indicator in title fields?

Thanks for testing and for your comment, @oddrun !

Let me look into this!

Fine! I suspect this bug is a result of the bibliographic data processing, though, not the UI.


Hi,

@oddrun you’re right:

  • there is a rule that strips the article from the title in all tags where a nonfiling (skip) indicator is present (tags: 130/240/245/730/830/740)

  • in particular, for the title Et dukkehjem (present in the new SVDE), there are a lot of records with tag 240 where the title is entered without the article.


In my opinion, we must distinguish between which characters of a title are used in sorting/filing, and the actual title itself. For example, a title field like
240 #3 $aEt dukkehjem
doesn’t mean that the first 3 characters of the title should be removed, only that the first 3 characters should be ignored in sorting procedures, i.e. when finding its place in an alphabetical index or filing system. There, this title should appear in the ‘D’ part.
However the (preferred) title is still Et dukkehjem, and should be rendered as such. I for one think it is very important to show the preferred titles of original works correctly.
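
To illustrate the distinction, here is a minimal sketch in Python. It assumes a title is held as its text plus the number of nonfiling characters from the indicator; the names `Title`, `display_title` and `sort_key` are hypothetical and are not taken from the SVDE code.

```python
from typing import NamedTuple


class Title(NamedTuple):
    text: str       # title exactly as recorded in $a
    nonfiling: int  # number of nonfiling characters from the indicator


def display_title(t: Title) -> str:
    # The preferred title is shown as recorded, article included.
    return t.text


def sort_key(t: Title) -> str:
    # Only the filing position skips the nonfiling characters.
    return t.text[t.nonfiling:].casefold()


titles = [
    Title("Et dukkehjem", 3),
    Title("De unges forbund", 3),
    Title("Brand", 0),
]

# Sort on the filing key, but render the full title.
shown = [display_title(t) for t in sorted(titles, key=sort_key)]
```

With this, Et dukkehjem files under ‘D’ and De unges forbund under ‘U’, while both are still rendered with their articles.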


Dear Oddrun, you’re right: the skip is used for the sorting of the titles.
Let me try to explain better why we decided to apply this rule.
We process millions of records (bibliographic and authority) that come from different libraries, organizations and so on. Each of them applies different cataloging practices.
As I already mentioned, the title of the opus comes from tags 130/240/245/730/830/740, but also from $t of tags 1XX/7XX. So, we can have:
• Bibliographic records where the title is in:

  1. 240 00$aEt dukkehjem
  2. 240 03$aEt dukkehjem
  3. 240 10$aDukkehjem
  4. 700 1 $aIbsen, Henrik,$d1828-1906$tDukkehjem

• Authority records where the title is in:

  5. 130 #0 $aDukkehjem
  6. 130 #0 $aEt Dukkehjem
  7. 130 #3 $aEt Dukkehjem
  8. 100 1# ‡aIbsen, Henrik ‡d (1828-1906). ‡tDukkehjem

Analyzing a large number of records, we noticed that in most of them the article is not reflected in the nonfiling indicator (see cases 1 and 6), or it is not recorded in the title at all (see cases 3 and 5).
Moreover, many opus titles come from $t in name/title (NT) tags (1XX, 7XX), so in order to reconcile all entities, we decided to cluster the titles without the article.
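
As a rough illustration of what clustering without the article means, here is a minimal Python sketch. It is not our actual pipeline: the `ARTICLES` set is a small illustrative list, and `match_key` and `cluster` are hypothetical names.

```python
from collections import defaultdict

# Illustrative list of Norwegian initial articles, not exhaustive.
ARTICLES = {"en", "ei", "et", "den", "det", "de"}


def match_key(title: str) -> str:
    # Build a matching key: lower-case, and drop a leading article if present.
    words = title.casefold().split()
    if len(words) > 1 and words[0] in ARTICLES:
        words = words[1:]
    return " ".join(words)


def cluster(titles):
    # Group the recorded variants under their article-free matching key.
    groups = defaultdict(list)
    for t in titles:
        groups[match_key(t)].append(t)
    return dict(groups)


variants = ["Et dukkehjem", "Dukkehjem", "Et Dukkehjem", "De unges forbund"]
clusters = cluster(variants)
```

Here the three Dukkehjem variants fall into one cluster, and the key is used only for matching; the recorded forms are kept alongside it.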


I see (I think). So, if every reference (in all the incoming records) to this title had been “Et dukkehjem”, then you would have kept that title in SVDE as well, including the article? Does this mean that it’s not the number (if any) in indicator 2 that causes the exclusion of articles, but the fact that some of the incoming data encode titles without an initial article?
I think this is not quite satisfactory in the long run, and it will cause a lot of J.Cricketing… Giving precedence to the records from the country of the author (or the country of first publication) as source of title could solve some of it.


Dear Oddrun,
I opened a development ticket to find a solution to this request. In the meantime, could you please send a list of Norwegian stop words that we can use to improve our rules and algorithm?

Hi, here is a stopword list for Norwegian (bokmål and nynorsk): Stopwords in Norwegian (by Snowball)

Thanks very much @oddrun .