INDEX
Explanations
instances of the word "I" and its variations in the text
Negative Logits
Token            Logit
źródło           -0.45
ⓧ                -0.42
Citiți           -0.42
Zustimmung       -0.42
Personendaten    -0.42
Савезне          -0.41
Atentamente      -0.41
Weiterlesen      -0.41
katholischen     -0.40
تقاوى            -0.39
Positive Logits
Token            Logit
’                0.76
(‘               0.49
.’               0.45
Morpho           0.45
’.               0.44
’).              0.44
…’               0.43
(“               0.43
’?               0.43
IDs              0.41
Activations Density 0.161%