INDEX
Explanations
verbs related to providing or explaining information, especially in a structured or procedural manner
Negative Logits
Kubrick        -0.70
highlighting   -0.65
olson          -0.63
ãĥ£            -0.63
prone          -0.62
Revised        -0.62
Pound          -0.60
skating        -0.59
heavier        -0.58
Malta          -0.57
Positive Logits
iced           1.22
icing          1.01
itude          1.01
ility          0.99
serv           0.97
iments         0.94
ando           0.93
itors          0.92
maid           0.91
itary          0.90
Activations Density: 0.017%