INDEX
Explanations
mentions of specific names related to a certain context or category
names and locations related to specific events or persons, particularly focusing on the term "Tav"
Negative Logits
head      -0.84
ŃĶ        -0.74
phrine    -0.73
autopsy   -0.73
omial     -0.73
Laden     -0.71
dose      -0.69
undai     -0.67
hood      -0.64
ACTED     -0.62
Positive Logits
liga      0.85
lain      0.80
ernels    0.79
ues       0.78
HAEL      0.76
ittal     0.75
Uriel     0.75
irtual    0.74
ï¸ı       0.73
comprom   0.73
Activations Density 0.038%