Explanations
expressions of visual assessment or evaluation
looks like an observation
Negative Logits
Token         Logit
adhyay        -0.53
ragamo        -0.51
ibrate        -0.51
Nantucket     -0.50
ghum          -0.49
passphrase    -0.49
Vineyard      -0.48
tričko        -0.47
ppuden        -0.47
hostage       -0.47
Positive Logits
Token         Logit
Looks         1.45
Looks         1.41
looks         1.31
looks         1.31
LOOKS         1.22
seems         0.77
appears       0.71
aussieht      0.70
seems         0.69
Parece        0.69
Activations Density: 0.004%