;
;  Prod:     CR-ASM
;  Size:     512b
;  Type:     Demotool?
;  Platform: MS-DOS (386+)
;  Group:    The Cronies
;  Date:     July 6, 2015
;  Contact:  orbitaldecay@gmail.com
;
;-~=<|$| Introduction |$|>=~-----------------------------------------------------
;
;   This is a self-hosting assembler (i.e. it can assemble its own source code)
;in 512 bytes! It assembles a Turing-complete subset of the x86 instruction set.
;It accepts the source file via STDIN and writes the machine code to STDOUT. Feel
;free to give it a go:
;
;   cr-asm.com < [source file] > cr-asm2.com
;
;Notice that cr-asm2.com is identical to cr-asm.com. You can do
;
;   cr-asm2.com < [source file] > cr-asm3.com
;
;ad infinitum (if you're insane and you get your jollies from that sort of thing)
;
;-~=<|$| Caveats |$|>=~----------------------------------------------------------
;
;   There are some caveats to the CR-ASM assembly language. Obviously, this was
;never intended to be seriously used, but is rather a proof of concept. I'll
;enumerate the shortcomings up front for the sake of fairness:
;
;   1. As mentioned earlier, it only implements a small subset of the x86
;      instruction set.
;
;   2. It is CaSe-SeNsItIvE. All alphabetic characters must be capitalized.
;
;   3. It does not tolerate any extraneous white-space (no blank lines, etc.)
;
;   4. There is no support for comments or labels :( This makes it about as fun
;      to code in as MS-DOS DEBUG.
;
;   5. There is no support for decimal values. All constants are assumed to be
;      hexadecimal.
;
;   6. The source file must end with two blank lines (this signals to the parser
;      to exit). I know. It's super lame.
;
;   7. If there is a syntax error in the source code, CR-ASM typically hangs.
;      Again, super lame, but what do you want out of a 512 byte assembler?
;
;   8. No support for UNIX-style line endings. A CR LF pair must immediately
;      follow each instruction (no white-space after the instruction).
;
;   9. Only registers AX, CX, DX, BX, SP, BP, SI, and DI are supported. No
;      support for segment registers.
;
;  10. Many of the mnemonics that CR-ASM uses are not the official mnemonics.
;      This was to make parsing the instructions easier.
;
;  11. Other snags that I'm not thinking of at the moment.
;
;-~=<|$| Supported instructions |$|>=~-------------------------------------------
;
;   Each instruction must be entered EXACTLY as described here. No extra
;white-space ANYWHERE! In particular, watch out for white-space after the end of
;the instruction.
;
;MOV XX, YYYY
;
;   Move the hexadecimal value YYYY into the register XX. YYYY must be exactly
;   four characters long. Use leading zeros where needed. This is the only form
;   of the MOV instruction that is supported. If you need to move values between
;   registers, use pushes and pops.
;
;INT YY
;
;   Call interrupt YY where YY is a hexadecimal value that is exactly two
;   characters long.
;
;PSH XX
;
;   Push the register XX onto the stack. Notice the mnemonic deviates from the
;   official "PUSH".
;
;POP XX
;
;   Pop the stack and store in register XX.
;
;INC XX
;
;   Increment register XX.
;
;DEC XX
;
;   Decrement register XX.
;
;CMPD
;
;   Double-word compare DS:SI to ES:DI and advance SI and DI by four. Notice
;   the mnemonic deviates from the official "CMPSD".
;
;CMPW
;
;   Word compare DS:SI to ES:DI and advance SI and DI by two. Notice the
;   mnemonic deviates from the official "CMPSW".
;
;LODW
;
;   Read a word from DS:SI into AX and advance SI by two. Notice the mnemonic
;   deviates from the official "LODSW".
;
;LODB
;
;   Read a byte from DS:SI into AL and advance SI by one. Notice the mnemonic
;   deviates from the official "LODSB".
;
;STOW
;
;   Write a word from AX to ES:DI and advance DI by two. Notice the mnemonic
;   deviates from the official "STOSW".
;
;STOB
;
;   Write a byte from AL to ES:DI and advance DI by one. Notice the mnemonic
;   deviates from the official "STOSB".
;
;JNE YY
;
;   If the zero flag is not set, then add YY to the instruction pointer where
;   YY is a hexadecimal value that is exactly two characters long. YY is a
;   signed displacement counted from the end of the instruction, so values of
;   80 and up jump backwards.
;
;JMP YYYY
;
;   Add YYYY to the instruction pointer where YYYY is a hexadecimal value that
;   is exactly four characters long. The sum wraps at 16 bits, so a value such
;   as FFF1 jumps backwards.
;
;CAL YYYY
;
;   Push the address of the next instruction and add YYYY to the instruction
;   pointer where YYYY is a hexadecimal value that is exactly four characters
;   long. Notice the mnemonic deviates from the official "CALL".
;
;RETN
;
;   Pop the stack and copy the value to the instruction pointer (near return).
;   Notice the mnemonic deviates from the official "RET".
;
;-~=<|$| Pseudo-instructions |$|>=~----------------------------------------------
;
;   In addition to the aforementioned instructions, CR-ASM supports two commands
;which allow the user to write data directly to the output. Notice these commands
;will write the characters which follow precisely (that includes white-space,
;new-lines, etc.).
;
;DSW XX
;
;   Write the two characters XX to the output file.
;
;DSD XXXX
;
;   Write the four characters XXXX to the output file.
;
;-~=<|$| Bye-bye! |$|>=~---------------------------------------------------------
;
;   Well, that about wraps it up. Thanks for taking the time to check out my
;little labour of love. I'd love to see someone pull this off in 256b. If you do,
;make sure to drop me a line. Greets go out to sensenstahl, jmph, frag,
;g0blinish, Rrrola, YOLP, and all size-coders.
;
;                                                        orbitaldecay 2015
MOV AX, 3F00
MOV BX, 0000
MOV CX, FFFF
MOV DX, 02FA
INT 21
MOV SI, 02FA
MOV DI, 02FA
PSH DI
MOV CX, 0001
MOV DI, 029E
CMPD
JNE 03
JMP 0008
INC CX
DEC SI
DEC SI
DEC SI
DEC SI
JMP FFF1
POP DI
DEC CX
JNE 13
MOV BX, 00B8
CAL 00F5
CMPW
DEC DI
DEC DI
MOV CX, 0004
CAL 00F7
STOW
JMP 00E2
DEC CX
JNE 0E
MOV AX, 00CD
STOB
MOV CX, 0002
CAL 00E6
STOB
JMP 00D1
DEC CX
JNE 09
MOV BX, 0050
CAL 00CE
JMP 00C5
DEC CX
JNE 09
MOV BX, 0058
CAL 00C2
JMP 00B9
DEC CX
JNE 07
MOV AX, A766
STOW
JMP 00AF
DEC CX
JNE 07
MOV AX, 00A7
STOB
JMP 00A5
DEC CX
JNE 07
MOV AX, 00AD
STOB
JMP 009B
DEC CX
JNE 07
MOV AX, 00AB
STOB
JMP 0091
DEC CX
JNE 07
MOV AX, 00AC
STOB
JMP 0087
DEC CX
JNE 07
MOV AX, 00AA
STOB
JMP 007D
DEC CX
JNE 09
MOV BX, 0040
CAL 007A
JMP 0071
DEC CX
JNE 09
MOV BX, 0048
CAL 006E
JMP 0065
DEC CX
JNE 0E
MOV AX, 0075
STOB
MOV CX, 0002
CAL 0069
STOB
JMP 0054
DEC CX
JNE 0E
MOV AX, 00E9
STOB
MOV CX, 0004
CAL 0058
STOW
JMP 0043
DEC CX
JNE 0E
MOV AX, 00E8
STOB
MOV CX, 0004
CAL 0047
STOW
JMP 0032
DEC CX
JNE 07
MOV AX, 00C3
STOB
JMP 0028
DEC CX
JNE 05
LODW
STOW
JMP 0020
DEC CX
JNE 07
LODW
STOW
LODW
STOW
JMP 0016
MOV AX, 4000
MOV BX, 02FA
DEC DI
DEC BX
JNE FC
PSH DI
POP CX
MOV BX, 0001
MOV DX, 02FA
INT 21
INT 20
CMPW
DEC DI
DEC DI
JMP FEEB
PSH AX
CAL 005C
INC AX
DEC BX
JNE FC
STOB
POP AX
RETN
PSH BX
MOV BX, 0000
CAL 0015
CAL 0023
INC AX
DEC AX
JNE 03
JMP 0004
INC BX
DEC AX
JNE FC
DEC CX
JNE EC
PSH BX
POP AX
POP BX
RETN
PSH AX
PSH CX
MOV CX, 0004
PSH BX
POP AX
INC BX
DEC AX
JNE FC
DEC CX
JNE F7
POP CX
POP AX
RETN
PSH BX
MOV AX, 0000
LODB
MOV BX, 0030
DEC AX
DEC BX
JNE FC
PSH AX
MOV BX, 000A
INC AX
DEC AX
JNE 03
JMP 000E
DEC BX
JNE F7
POP AX
MOV BX, 0007
DEC AX
DEC BX
JNE FC
JMP 0001
POP AX
POP BX
RETN
PSH DI
MOV AX, 0000
MOV DI, 02EA
CMPW
JNE 03
JMP 0006
INC AX
DEC SI
DEC SI
JMP FFF4
POP DI
RETN
DSD MOV 
DSD INT 
DSD PSH 
DSD POP 
DSD CMPD
DSD CMPW
DSD LODW
DSD STOW
DSD LODB
DSD STOB
DSD INC 
DSD DEC 
DSD JNE 
DSD JMP 
DSD CAL 
DSD RETN
DSD DSW 
DSD DSD 
DSD 


DSW AX
DSW CX
DSW DX
DSW BX
DSW SP
DSW BP
DSW SI
DSW DI
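;
;   The dispatch chain in the listing above can be restated in a higher-level
;language. What follows is a minimal Python sketch, not the author's tooling:
;the names TABLE, REGS, parse_hex and assemble are made up for illustration, the
;opcode values are read off the constants in the listing, and unlike CR-ASM it
;raises an error on bad input instead of hanging.
;

```python
# Hypothetical sketch of the CR-ASM encoding rules; all names are illustrative.
REGS = ["AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI"]  # list index = register number
HEX = "0123456789ABCDEF"  # upper-case digits only, as CR-ASM requires

# 4-character mnemonic field -> (opcode bytes, operand kind)
TABLE = {
    "MOV ": (b"\xB8", "reg_imm16"),
    "INT ": (b"\xCD", "imm8"),
    "PSH ": (b"\x50", "reg"),
    "POP ": (b"\x58", "reg"),
    "CMPD": (b"\x66\xA7", None),
    "CMPW": (b"\xA7", None),
    "LODW": (b"\xAD", None),
    "STOW": (b"\xAB", None),
    "LODB": (b"\xAC", None),
    "STOB": (b"\xAA", None),
    "INC ": (b"\x40", "reg"),
    "DEC ": (b"\x48", "reg"),
    "JNE ": (b"\x75", "imm8"),
    "JMP ": (b"\xE9", "imm16"),
    "CAL ": (b"\xE8", "imm16"),
    "RETN": (b"\xC3", None),
    "DSW ": (b"", "raw2"),
    "DSD ": (b"", "raw4"),
}


def parse_hex(digits: str, width: int) -> int:
    """Parse exactly `width` upper-case hex digits."""
    if len(digits) != width or any(c not in HEX for c in digits):
        raise ValueError(f"bad hex constant {digits!r}")
    return int(digits, 16)


def assemble(src: str) -> bytes:
    """Translate CR-ASM text (CR LF line ends, two blank lines last) to bytes."""
    out = bytearray()
    pos = 0
    while True:
        field = src[pos:pos + 4]
        if field == "\r\n\r\n":  # two blank lines mark the end of the source
            return bytes(out)
        if field not in TABLE:
            raise ValueError(f"unknown mnemonic {field!r} at offset {pos}")
        opcode, kind = TABLE[field]
        pos += 4
        if kind in ("reg", "reg_imm16"):
            name = src[pos:pos + 2]
            if name not in REGS:
                raise ValueError(f"unknown register {name!r}")
            out.append(opcode[0] + REGS.index(name))  # register number folded into opcode
            pos += 2
            if kind == "reg_imm16":
                pos += 2  # skip the ", " separator
                out += parse_hex(src[pos:pos + 4], 4).to_bytes(2, "little")
                pos += 4
        elif kind == "imm8":
            out += opcode
            out.append(parse_hex(src[pos:pos + 2], 2))
            pos += 2
        elif kind == "imm16":
            out += opcode
            out += parse_hex(src[pos:pos + 4], 4).to_bytes(2, "little")
            pos += 4
        elif kind in ("raw2", "raw4"):
            n = int(kind[3])
            payload = src[pos:pos + n]
            if len(payload) != n:
                raise ValueError("truncated data directive")
            out += payload.encode("latin-1")  # copied verbatim, CR/LF included
            pos += n
        else:
            out += opcode  # no operands
        pos += 2  # skip the CR LF that ends the instruction (not validated)
```

;Feeding it "MOV AX, 3F00" followed by CR LF and two blank lines yields the three
;bytes B8 00 3F: the opcode with the register number added, then the constant
;stored low byte first.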