INDEX
Explanations
the word "guess"
expressions of uncertainty or conjecture
Negative Logits
roma               -0.86
andals             -0.80
natureconservancy  -0.78
verb               -0.76
ĸļ                 -0.72
elight             -0.71
vis                -0.71
vertisement        -0.69
cling              -0.69
asca               -0.68
Positive Logits
guess              1.22
guesses            1.05
Guess              1.01
guessing           0.98
interpretation     0.79
guessed            0.74
lessly             0.69
RL                 0.68
ariat              0.66
excuse             0.66
Activations Density 0.008%