Explanations
given names followed by surnames
Negative Logits
Token        Value
Error        0.48
procedure    0.46
Для          0.46
Mrs          0.46
http         0.45
(blank)      0.44
И            0.43
object       0.43
Про          0.42
sns          0.42
Positive Logits
Token        Value
Jones        0.68
Miller       0.67
Zelensky     0.66
Johnson      0.66
Clarke       0.64
Franklin     0.64
Smith        0.64
הש           0.63
Skywalker    0.63
Biden        0.63
Activations Density 0.071%
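
The density figure above can be read as the share of token positions on which the feature fires. The sketch below is a minimal illustration of that reading, not the dashboard's actual computation: the function name `activation_density`, the zero threshold, and the sample data are all assumptions made for the example.

```python
def activation_density(activations, threshold=0.0):
    """Return the fraction of token positions whose activation exceeds threshold.

    Assumption: "density" means the share of positions with activation above
    the threshold (zero by default). The source does not spell this out.
    """
    if not activations:
        return 0.0
    active = sum(1 for a in activations if a > threshold)
    return active / len(activations)


# Hypothetical data: 10,000 token positions, 7 of which activate the feature.
sample = [0.0] * 9993 + [1.2, 0.4, 3.1, 0.9, 2.2, 0.1, 0.7]
density = activation_density(sample)
print(f"{density:.3%}")
```

A density this low would be consistent with a sparse feature that fires only in narrow contexts, such as the position right after a given name.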