Explanations
journey, activity, time, crypto, money
Negative Logits
[token not shown]  0.66
University         0.55
H                  0.54
trumpet            0.53
the                0.52
års                0.52
chocolate          0.50
venerable          0.50
L                  0.50
I                  0.50
Positive Logits
ⅽ        0.67
并且     0.61
ⅾ        0.55
᧐        0.54
whereas  0.53
voordat  0.53
Ⲛ        0.53
aswell   0.52
而且     0.50
由于     0.50
Activations Density 0.001%