INDEX
Explanations
references to specific individuals, particularly in the context of media or public figures
Negative Logits
ViewInit      -0.60
Bewußt        -0.57
uș            -0.56
vacacionales  -0.56
térmico       -0.55
ExtendWith    -0.55
odeur         -0.54
Jîn           -0.54
sanitarias    -0.54
redacción     -0.52
Positive Logits
TZ            0.47
(“            0.46
              0.44
(&            0.42
Fucking       0.42
「            0.42
(‘            0.40
(«            0.40
consultato    0.39
("            0.39
Activations Density 0.448%
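
The page does not define the density figure. A common reading, assumed here, is the fraction of sampled token positions on which the feature's activation is above zero. The sketch below illustrates that reading only; the function name `activation_density`, the `threshold` parameter, and the sample data are hypothetical and do not come from the page.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of positions whose activation exceeds `threshold`.

    Assumption: "density" means the share of token positions on which the
    feature is active. The source page does not state its exact definition.
    """
    if not activations:
        raise ValueError("activations must be non-empty")
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 1000 token positions, 4 of which activate the feature.
sample = [0.0] * 996 + [1.2, 0.3, 2.5, 0.8]
density = activation_density(sample)
print(f"{density:.3%}")
```

Under this reading, a density of 0.448% would mean the feature is active on roughly 4 to 5 of every 1000 sampled token positions.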