INDEX
Explanations
descriptive negative situations
NEGATIVE LOGITS
Ing         0.39
𝐧           0.38
allen       0.37
ellingen    0.37
ليس         0.37
ⵍ           0.36
Cheng       0.36
Sim         0.35
ਉ           0.35
𝐄           0.35
POSITIVE LOGITS
polystyrene 0.43
tangle      0.39
puddle      0.39
סה          0.38
sergeant    0.38
sgt         0.38
optimum     0.38
agony       0.38
rained      0.37
terminou    0.37
Activations Density 0.001%