INDEX
Explanations
exact terms or phrases
occurrences of the word "exact"
Negative Logits
Token        Logit
Democr       -0.78
ker          -0.77
Flavoring    -0.71
rift         -0.68
Krishna      -0.67
ashore       -0.65
Paw          -0.64
kers         -0.63
Pigs         -0.63
Krug         -0.62
Positive Logits
Token        Logit
exact        0.94
itude        0.92
wording      0.80
antit        0.76
ensions      0.76
ions         0.76
ident        0.76
amount       0.75
same         0.74
itudes       0.73
Activations Density: 0.005%