Explanations
plausible explanations or reasons
variations of the word "plausible" and its related forms
Negative Logits
Token       Logit
sterdam     -0.74
ounty       -0.74
angelo      -0.74
usterity    -0.73
enfranch    -0.73
elight      -0.72
zona        -0.72
une         -0.71
eg          -0.70
cellence    -0.69
Positive Logits
Token           Logit
ausible         0.89
\\\\\\\\        0.86
explanation     0.82
plausible       0.82
excuse          0.81
explanations    0.80
plaus           0.80
den             0.78
guesses         0.76
scenario        0.76
Activations Density: 0.028%