Explanations
references to various stages and scenes in narratives or performances
Negative Logits
armée          -0.56
grecque        -0.48
tisgarh        -0.47
Política       -0.45
kiệm           -0.45
Unidas         -0.44
mathematician  -0.44
Спољашње       -0.43
prieten        -0.43
pañol          -0.43
Positive Logits
STAGE   0.85
Stage   0.83
stage   0.81
stages  0.79
Scene   0.79
Stage   0.78
PHASE   0.78
Phase   0.77
scene   0.77
Phase   0.73
Activations Density: 0.552%