INDEX
Explanations
proper nouns and references to specific names or entities
Negative Logits
Edith          -0.54
فريبيس         -0.51
betweenstory   -0.50
ächs           -0.46
pean           -0.46
Viv            -0.45
╯              -0.44
Nutri          -0.44
nack           -0.43
fag            -0.43
Positive Logits
Rai            2.59
Rai            2.38
RAI            1.68
rai            1.61
rai            1.34
RAI            1.32
Rae            1.10
Rais           0.94
Rae            0.91
Рай            0.90
Activations Density 0.002%