From 69208efce1a43b3ea2ee93d85e2706ed7ee57581 Mon Sep 17 00:00:00 2001 From: spamegg Date: Tue, 21 Jun 2022 11:47:39 +0300 Subject: [PATCH] added OSTEP course page --- README.md | 2 +- coursepages/ostep/OSTEP.md | 160 ++++++++++++++++ coursepages/ostep/Project-1B-initial-xv6.md | 35 ++++ coursepages/ostep/Reading-order.md | 13 ++ coursepages/ostep/Scheduling-xv6-lottery.md | 202 ++++++++++++++++++++ coursepages/ostep/vm-xv6-intro.md | 13 ++ 6 files changed, 424 insertions(+), 1 deletion(-) create mode 100644 coursepages/ostep/OSTEP.md create mode 100644 coursepages/ostep/Project-1B-initial-xv6.md create mode 100644 coursepages/ostep/Reading-order.md create mode 100644 coursepages/ostep/Scheduling-xv6-lottery.md create mode 100644 coursepages/ostep/vm-xv6-intro.md diff --git a/README.md b/README.md index 1237e82..8a29991 100644 --- a/README.md +++ b/README.md @@ -227,7 +227,7 @@ Courses | Duration | Effort | Additional Text / Assignments| Prerequisites | Dis :-- | :--: | :--: | :--: | :--: | :--: [Build a Modern Computer from First Principles: From Nand to Tetris](https://www.coursera.org/learn/build-a-computer) ([alt](https://www.nand2tetris.org/)) | 6 weeks | 7-13 hours/week | - | C-like programming language | [chat](https://discord.gg/vxB2DRV) [Build a Modern Computer from First Principles: Nand to Tetris Part II ](https://www.coursera.org/learn/nand2tetris2) | 6 weeks | 12-18 hours/week | - | one of [these programming languages](https://user-images.githubusercontent.com/2046800/35426340-f6ce6358-026a-11e8-8bbb-4e95ac36b1d7.png), From Nand to Tetris Part I | [chat](https://discord.gg/AsUXcPu) -[Operating Systems: Three Easy Pieces](https://pages.cs.wisc.edu/~remzi/Classes/537/Spring2018/) | 10-12 weeks | 6-10 hours/week | - | algorithms, [familiarity with C](https://hackr.io/tutorials/learn-c?sort=upvotes&type_tags%5B%5D=1) is useful | [chat](https://discord.gg/wZNgpep) +[Operating Systems: Three Easy Pieces](coursepages/ostep/OSTEP.md) | 10-12 weeks | 6-10 
hours/week | - | algorithms, [familiarity with C](https://hackr.io/tutorials/learn-c?sort=upvotes&type_tags%5B%5D=1) is useful | [chat](https://discord.gg/wZNgpep) [Computer Networking: a Top-Down Approach](http://gaia.cs.umass.edu/kurose_ross/online_lectures.htm)| 8 weeks | 4–12 hours/week | [Wireshark Labs](http://gaia.cs.umass.edu/kurose_ross/wireshark.php) | algebra, probability, basic CS | [chat](https://discord.gg/MJ9YXyV) ### Core theory diff --git a/coursepages/ostep/OSTEP.md b/coursepages/ostep/OSTEP.md new file mode 100644 index 0000000..f5475fa --- /dev/null +++ b/coursepages/ostep/OSTEP.md @@ -0,0 +1,160 @@ +# Operating Systems: Three Easy Pieces + +Credit goes to [palladian](https://github.com/palladian1) +## Introduction + +First, we should be frank: it's really hard to find a good self-contained online course on operating systems. OSTEP is the best course we've found so far, but it does have some issues. + +This is the first course in the OSSU curriculum for which you'll need to learn some prerequisites on your own before starting it, in addition to the courses that come before it in the curriculum. You might also run into some issues running the scripts for homework demos and for testing your solutions to the projects (although we hope we've solved most of those by now). + +What this means for you is that if you're under a significant time crunch, or you're just not all that interested in systems programming and OS development, there's no shame in skipping this course and coming back to it later. You could also do only a part of the course (e.g. you might choose to skip the homework and/or projects). + +That said, if you're able to commit the time required for the prerequisites, we believe the reward is well worth the effort: this course is exciting, interesting, and quite useful for other fields of computer science and programming. 
One big attraction of this course is the opportunity to see a simplified but fully-functional Unix-like operating system in action and understand the concepts and design decisions that went into it as well as the low-level implementation details. + +In order to satisfy OSSU's curricular guidelines, you should either watch all the lecture videos or read chapters 1 through 47 in the textbook (don't worry, the chapters are usually just a few pages long) as well as finish the projects listed below. We also strongly encourage you to do the homework exercises as they're assigned on the course website or in the book chapters; think of these like the "check-your-understanding" questions that pop up in the middle of lecture videos on sites like Coursera or edX. + +## Prerequisites + +This class requires a lot of experience programming in C. You should finish one of the C books listed in the ***resources below*** *before* starting this course; if you try to learn C at the same time as the course material, you're likely to feel overwhelmed. If you haven't used C before, you should expect to spend a lot of time on this; it's hard to predict how long it might take for each person, but a rough estimate might be 8-10 hours per week for 3-5 weeks. You can always learn C alongside another OSSU course or even redo the exercises for other courses in C to gain practice with it. + +You should also finish both parts of Nand2Tetris before starting this course. OSTEP focuses on the real-world x86 and x86_64 architectures, so you'll have to fill in some gaps in order to translate the concepts you learned in Nand2Tetris to a new architecture. You can do that with the x86 resources below, but note that they all assume you know C, so learn that first. This should take around 6-8 hours in total. 
+ +## Course Links + +* [Course website](https://pages.cs.wisc.edu/~remzi/Classes/537/Spring2018/) +* [Book](https://pages.cs.wisc.edu/~remzi/OSTEP/) +* [Lecture videos](https://pages.cs.wisc.edu/~remzi/Classes/537/Spring2018/Discussion/videos.html) +* [Homework](https://pages.cs.wisc.edu/~remzi/OSTEP/Homework/homework.html) +* [Homework repo](https://github.com/remzi-arpacidusseau/ostep-homework) +* [Projects](https://github.com/remzi-arpacidusseau/ostep-projects) +* [xv6](https://github.com/mit-pdos/xv6-public) + +## Roadmap + +This course was originally taught as CS 537 at the University of Wisconsin by the author of the OSTEP textbook, so the projects are scheduled around the times when UWisconsin students have access to on-campus resources like recitation sections and office hours. That means they don't match up perfectly with the material being covered at that time in the lectures or textbook chapters. We recommend doing the course in the following order instead. + +[Reading order](Reading-order.md) + +* Read chapters 1 and 2 of the OSTEP textbook and watch the first half (the introduction) of lecture 1. +* Do the `initial-utilities` project; it's intended as a litmus test for you to make sure you're comfortable enough with C before taking this class. You can watch discussion 1 for help. If it takes you more than 2 hours to write the code (not counting the discussion time and any time spent debugging), you should consider spending more time learning C before moving on in the course. (If you want more practice, you can do `initial-reverse` too, but it's not required.) +* Watch lectures 1 through 5 and read chapters 3 through 24 of the OSTEP textbook. We recommend doing the homework assignments as they come up in the course calendar or book chapters. +* Watch discussion 3 and reread chapter 5, then do the `processes-shell` project. 
+* Read the annotated guide to xv6 linked in the resources section below, starting from the beginning and stopping after the `System Calls: Processes` section. +* Watch discussion 2, then do the `initial-xv6` project. +* Watch discussion 5, then do the `scheduling-xv6-lottery` project. +* Watch discussion 7, then do the `vm-xv6-intro` project. +* Watch lectures 6 through 9 (and optionally, the review lecture) and read chapters 25 through 34; again, you're encouraged to do the homework. +* Watch discussion 10, then do the `concurrency-xv6-threads` project. +* Watch discussions 11 and 12, then do the `concurrency-mapreduce` project. +* Watch the remaining lectures through 14 (and optionally, the second review lecture) and read chapters 35 through 47; remember to do the homework along with the lectures or chapters. +* Do the `filesystems-checker` project. + +### Running the Projects + +This course was originally taught as CS 537 at the University of Wisconsin by the author of the OSTEP textbook, which means that the homework and projects were written with those students as a target audience and designed to be run on UWisconsin lab computers with specific software versions pre-installed for students. We hope this section fixes that so you can run them on other computers, but we haven't tested this on every computer, so if you run into other issues, let us know on the [Discord channel](https://discord.gg/wZNgpep) and we'll try to help out. + +In order to run the homework and projects on Linux or macOS, you'll need to have all of the following programs installed: + +* `gcc` +* `gas` +* `ld` +* `gdb` +* `make` +* `objcopy` +* `objdump` +* `dd` +* `python` +* `perl` +* `gawk` +* `expect` +* `git` + +You will also need to install `qemu`, but we recommend using the patched version provided by the xv6 authors; see [this link](https://pdos.csail.mit.edu/6.828/2018/tools.html) for details. 
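A quick way to see which of the programs listed above are missing is a small shell check. This is only a sketch: the `missing_tools` helper is a hypothetical name, and the binary names are assumptions (the GNU assembler `gas` is normally installed as `as`, and some systems ship only `python3` rather than `python`), so adjust the list to your system.

```sh
# Print the name of every command in the argument list that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        # `command -v` is the POSIX way to ask whether a command can be found.
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed binary names: `as` stands in for `gas`; swap `python` for `python3`
# if that is what your system provides.
missing=$(missing_tools gcc as ld gdb make objcopy objdump dd python perl gawk expect git)

if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed tools were found."
fi
```

The check only confirms that a command with that name exists; it says nothing about versions, and it does not cover `qemu` or the macOS cross-compiler.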
+ +On macOS, you'll need to install a cross-compiler `gcc` suite capable of producing x86 ELF binaries; see the link above for details as well. + +On Windows, you can use a Linux virtual machine for the homework and projects. Some of these packages are not yet supported on Apple M1 computers, and virtual machine software has not yet been ported to the new processor architecture; some students have used a VPS to do the homework and projects instead. + +Next, clone the `ostep-homework` and `ostep-projects` repositories: +```sh +git clone https://github.com/remzi-arpacidusseau/ostep-homework/ +git clone https://github.com/remzi-arpacidusseau/ostep-projects/ +cd ostep-projects +``` + +You'll have to clone [the `xv6-public` repository](https://github.com/mit-pdos/xv6-public) into the directory for each xv6-related OSTEP project. You could use the same copy for all the projects, but we recommend using separate copies to avoid previous projects causing bugs for later ones. Run the following commands in *each* of the `initial-xv6`, `scheduling-xv6-lottery`, `vm-xv6-intro`, `concurrency-xv6-threads`, and `filesystems-checker` directories. + +```sh +mkdir src +git clone https://github.com/mit-pdos/xv6-public src +``` + +
+ +Hints and tips for Projects + +- `initial-reverse`: the error messages given in the project text didn't match the tests. The provided text said `"error: ..."` but the tests expected `"reverse: ..."`, so make sure to match the tests' expectations in your code. +- `processes-shell`: to pass all the tests, I had to edit `/tests/3.pre` to use `/bin/ls` because of how `ls` is set up on my system. Alternatively, you can add `export PATH="/bin:$PATH"` to your `.profile` or `.bashrc` file. +- [hints for Project 1B: `initial-xv6`](Project-1B-initial-xv6.md) +- [hints for `scheduling-xv6-lottery`](Scheduling-xv6-lottery.md) +- [hints for `vm-xv6-intro`](vm-xv6-intro.md) +
+ +## Resources + +### C + +Please don't try to learn C from sites like GeeksforGeeks, TutorialsPoint, or Hackr.io (we're not even gonna link to them here). Those are great resources for other languages, but C has way too many pitfalls, and C tutorials online are often filled with dangerous errors and bad coding practices. We looked at many C resources for the recommendations below and unfortunately found *many* bad or unsafe ones; we'll only include the best ones here, so look no further! + +We recommend learning C by working through (the entirety of) Jens Gustedt's [*Modern C*](https://modernc.gforge.inria.fr), which is [freely available online](https://hal.inria.fr/hal-02383654/document). This book is relatively short and will bring you up to speed on the C language itself as well as modern coding practices for it. Make sure to do all the exercises in the footnotes! + +While the book above is our default recommendation, we also recommend K.N. King's [*C Programming: A Modern Approach*](http://www.knking.com/books/c2/) as a second, more beginner-friendly option. It has some disadvantages: it's much longer (almost 850 pages), it's not available for free (and copies can be hard to find), and it's not quite as recent as *Modern C* (but still relevant nonetheless). That said, it has more exercises if you want extra practice, and the Q&A sections at the end of each chapter are filled with pearls of C wisdom and answers to C FAQs. It also covers almost the entirety of the C language and standard library, so it doubles as a reference book. + +CS 50 doesn't quite cover enough C for OSTEP, but if you've already taken CS 50, you can supplement it with the books above. 
+ +Additional (***optional***) resources include: +* [CS 50 Manual Pages](https://manual.cs50.io): a great reference for looking up C library functions; most functions include both the usual manual page and a beginner-friendly "less comfortable" option (just note that the "less comfortable" version uses `string` as an alias for `char *`). +* [comp.lang.c FAQs](http://c-faq.com): answers to all of your questions about C minutiae. +* [cdecl](https://cdecl.org): a tool to translate C gibberish into English. +* [C track on exercism.io](https://exercism.io/tracks/C): additional practice exercises. +* [*Secure Coding in C and C++*](https://www.amazon.com/dp/0321822137): if you want to understand why other C resources are so unsafe. +* [*The C Programming Language*](https://www.amazon.com/dp/0131103628): the original book on C by its creators. Too outdated for OSTEP, but a good read if you manage to find a copy. + +### x86 Architecture and Assembly Language + +Nand2Tetris has already introduced most of the concepts you'll need to understand systems and computer architectures, so now you just need to port that knowledge to the real-world (32-bit) x86 architecture. + +The easiest way to do that is by watching a subset of the lectures from the *Computer Systems: A Programmer's Perspective* course (or reading the matching chapters in the [textbook](https://www.amazon.com/dp/013409266X) of the same name). 
The lectures you'll need are: + +* [Machine-Level Programming I: Basics](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=6e410255-3858-4e85-89c7-812c5845d197) +* [Machine-Level Programming II: Control](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=fc93c499-8fc9-4652-9a99-711058054afb) +* [Machine-Level Programming III: Procedures](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=2994255f-923b-4ad4-8fb4-5def7fd802cd) +* [Machine-Level Programming IV: Data](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=03308c94-fc20-40d8-8978-1a9b81c344ed) +* [Machine-Level Programming V: Advanced Topics](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=3f0bf9ca-d640-4798-b91a-73aed656a10a) +* [Linking](https://scs.hosted.panopto.com/Panopto/Pages/Viewer.aspx?id=0aef84fc-a53b-49c6-bb43-14cb2b175249) + +Additional (***optional***) resources include: +* [CPU Registers x86](https://wiki.osdev.org/CPU_Registers_x86): good for looking up specific registers. +* [*PC Assembly Language*](https://pdos.csail.mit.edu/6.828/readings/pcasm-book.pdf): a short book on x86 assembly. +* [GCC Inline Assembly HOWTO](https://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html): a guide to writing assembly code inside a C program. +* [*Intel 80386 Programmer's Reference Manual*](https://pdos.csail.mit.edu/6.828/2018/readings/i386.pdf): the official (and huge) resource from Intel. + +### xv6 + +You don't need to read anything about xv6 until after you start OSTEP; in fact, we recommend holding off on the xv6-related projects until you've finished the entire section on virtualization. After that, you'll need a guide to walk you through the source code. + +The xv6 authors provide a [book](https://pdos.csail.mit.edu/6.828/2018/xv6/book-rev11.pdf) that you can read alongside the source code. 
There's also a handy line-numbered [PDF version](https://pdos.csail.mit.edu/6.828/2018/xv6/xv6-rev11.pdf) of the code with an index to see exactly where each function or constant gets used. + +However, that book glosses over a lot of the details in the code that you might find challenging, including the advanced C features used, the x86 architecture-specific instructions, and the concurrency aspects (if you haven't finished that section of OSTEP before starting the xv6 projects). To solve this problem, we provide an [annotated guide to xv6](https://github.com/palladian1/xv6-annotated) that goes over the entire xv6 code and analyzes it line-by-line with explanations of the C features, hardware specs, and x86 conventions used. That means it's longer than the official xv6 book, so you don't have to read all of it (and you can probably skip the optional sections unless you care about device drivers), but you can use it as a reference if you're scratching your head about some part of the code. + +Also, [here](https://www.youtube.com/playlist?list=PLbtzT1TYeoMhTPzyTZboW_j7TPAnjv9XB) is an excellent video series walking through much of the xv6 code. + +### Miscellaneous + +You'll need a general sense of how Makefiles work in order to use the Makefile for xv6. [This tutorial](https://makefiletutorial.com) covers much more than you need; just read the "Getting Started" and "Targets" sections and come back to the rest later if you need to look something up (but you shouldn't have to). + +Additional (optional) resources include: +* [GCC Command Options](https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Invoking-GCC.html#Invoking-GCC): a guide to command-line flags for the GNU C compiler `gcc`. +* [Linker Scripts](https://sourceware.org/binutils/docs/ld/Scripts.html#Scripts): a guide to writing scripts for the GNU linker `ld`. +* [OSDev Wiki](https://wiki.osdev.org): a great resource for all kinds of OS concepts and implementation details. 
+* [*Linux Kernel Development*](https://www.amazon.com/dp/0672329468): if you want to apply your xv6 knowledge toward contributing to the Linux kernel, this is a great read after OSTEP. diff --git a/coursepages/ostep/Project-1B-initial-xv6.md b/coursepages/ostep/Project-1B-initial-xv6.md new file mode 100644 index 0000000..8ef5731 --- /dev/null +++ b/coursepages/ostep/Project-1B-initial-xv6.md @@ -0,0 +1,35 @@ +## Project 1B + +### all thanks to [palladian](https://github.com/palladian1) + +### Linux Installation + +* Make sure you have a compatible compiler toolchain; if you're on Linux, gcc should work perfectly. +* Install qemu-system-x86 (may be called qemu-system-i386 or qemu-system-x86_64; note that on some distros, qemu is the wrong package). +* Install Perl. +* Install gawk. +* Install expect. +* Make a src/ directory in the same directory as the project's test script. +* Clone the xv6 GitHub repo and copy the source code into your src/ directory. +* Inside src/, run `make qemu-nox` to test whether xv6 is working. Exit xv6 with `Ctrl-a x`; if you forget this, you can also kill the qemu process. It's worth checking `top` or `htop` to make sure qemu isn't running anymore; sometimes it can keep going after you exit and consume a lot of resources. +* Modify the Makefile to set `CPUS := 1`. +* Run `make qemu-nox` again to test that xv6 still works. + +### Instructions + +* Your task is to create a new system call for xv6, `getreadcount()`, that will return the number of `read` syscalls that have previously taken place. Note that the count should be a global count, not a per-process count. + +### Suggested Approach + +* Download the xv6 source code PDF (it's better organized there than in the code you downloaded). Read the table of contents to understand how sheets, pages, and lines are numbered. Then glance at the cross-references after that so you know how to find parts of the code if you need to. 
+* Take a (very) quick look at the portions of the xv6 source code listed under `processes` and `system calls` on the table of contents, as well as `usys.S` in the `user-level` section. Don't worry about understanding it yet; you just need to see where each file is in the PDF so that you can follow along with the discussion video later, since the professor's code has a different directory layout than yours will. +* Watch the video for discussion 2 on Project P1B, and follow along with your PDF copy of the xv6 code. Annotate it as the professor explains what each part does. +* Read the background section linked on the project's GitHub page, annotating the xv6 code PDF. +* Read through the xv6 PDF one more time, this time to get a general understanding of the `processes` and `system calls` sections, as well as `usys.S` and `user.h` (NOTE: the last one isn't included in the xv6 PDF, so you'll have to look at the actual code you downloaded). Don't worry about understanding every last line of code; just make sure you know where system calls are defined, how they're called, etc. +* Modify the xv6 source code to add the new `getreadcount()` syscall. You will need to modify several files; I suggest marking your modifications with `// OSTEP project` to make it easy to find them later for debugging. +* There is one other place you'll have to add code, which isn't included in the xv6 PDF: `user.h`. +* Once you're done, run the test script. Test 1 runs a function that will make several `read` calls, then calls `getreadcount`. In order for your code to work, you must correctly keep track of the total number of `read` calls made by all processes. +* If your code passes test 1, congratulations! You're done for now. Don't worry about test 2 until after you've watched the lectures on concurrency. +* If your code didn't pass test 1, you can compare the expected output in `tests/1.out` with your test's actual output in `tests-out/1.out`. 
You can also look at `tests-out/1.err` to check for any error messages. +* You can also test your code by loading up xv6 in your terminal with `make qemu-nox`. Type `ls` to see all files; you should see `test_1` and `test_2`. Run test 1 with `./test_1` to see what it prints out; you can compare that manually with the expected output. +* Once you've watched the lectures on threads, concurrency, and locks: test 2 checks whether your implementation of `getreadcount` is thread-safe. It probably wasn't before, so in order to fix that, you'll have to add a lock. Then you can run the test script again and check that your code now passes both tests. diff --git a/coursepages/ostep/Reading-order.md b/coursepages/ostep/Reading-order.md new file mode 100644 index 0000000..6629725 --- /dev/null +++ b/coursepages/ostep/Reading-order.md @@ -0,0 +1,13 @@ +### all thanks to [palladian](https://github.com/palladian1) + +* Before starting the course: `initial-utilities` (aka, project 1a) and `initial-reverse` (not assigned in class) +* Chapter 5: `processes-shell` (project 2a) +* Chapter 6: `initial-xv6` (project 1b, but only worry about test 1) +* Chapter 9: `scheduling-xv6-lottery` (project 2b) +* Chapter 24: `vm-xv6-intro` (project 3b) +* Chapter 28: `initial-xv6` (now pass test 2) +* Chapter 29: `concurrency-xv6-threads` (project 4b) +* Chapter 30: `concurrency-mapreduce` (project 4a) +* Chapter 33: `concurrency-webserver` (not assigned in class) +* Chapter 42: `filesystems-checker` (project 5a) +* (there is no 5b, and there are no projects for chapters 43-51) diff --git a/coursepages/ostep/Scheduling-xv6-lottery.md b/coursepages/ostep/Scheduling-xv6-lottery.md new file mode 100644 index 0000000..59d02da --- /dev/null +++ b/coursepages/ostep/Scheduling-xv6-lottery.md @@ -0,0 +1,202 @@ +## all thanks to [palladian](https://github.com/palladian1) + +### General Tips + +* Read chapter 9 in the OSTEP book and watch the video for discussion 5. 
Lottery ticket schedulers aren't discussed in the lectures, so you really do have to read the book for this one. + +* In general, you can't use C standard library functions inside the kernel, because the kernel has to initialize before it can execute library binaries. + +* xv6 has its own version of `printf` for user programs; it takes an additional integer argument (a file descriptor) that tells it whether to print to `stdout` or `stderr`. Note that it can only handle basic format strings like `"%d"` and not more complex ones like `"%6.3g"`; you can deal with this by manually adding spaces instead. Inside the kernel, use the similar function `cprintf`, which takes no file-descriptor argument and prints to the console. + +* If you do want to use other library functions that aren't available inside the kernel (such as pseudo-random number generators), you can see how those functions are implemented in P.J. Plauger's book, *The Standard C Library*, and then implement them yourself. + + ### Implementation + + * You'll have to modify the same files you did in Project 1b in order to add the two new system calls. + * In order to understand how processes are created, remember that they start in the `EMBRYO` state before they become `RUNNABLE`--you'll have to find where that happens. + * System calls always have argument type `void`, so take a look at how system calls like `kill` and `read` manage to work around that limitation and get arguments (like integers and pointers) from user space. You might have to go back a few steps in the chain that executes them. + * Make sure you're including `types.h` and `defs.h` wherever you need to access code from other parts of the kernel. + * In order to create the xv6 command `ps`, look at how `cat`, `ls`, and `ln` are implemented. Make sure to modify the Makefile to include the source code for your `ps` command. + + +## Spoilers below! + +### Solution walk through + +- Start from a fresh copy of the `xv6` source code. + +- `argint` and `argptr` are important functions. 
On the kernel side, system calls take no arguments, but in user code you want to pass arguments to them. + +- The way that works is that the kernel calls the system call handler, say, `sys_kill()`, with no arguments; then `sys_kill` uses `argint()` to get the arguments from the call stack and passes them to a function `kill(int pid)`. + +- In `syscall.c`, below `argint` and `argptr`, there's a bunch of `extern int sys_whatever` function declarations; that means that these functions are defined in another file and are pulled in from there so they can be used as function pointers. + +- And these `sys_whatever` functions are basically just wrappers for the real system call function, which doesn't have the `sys_` at the beginning. So you need to add `sys_settickets` and `sys_getpinfo` to that list of function declarations. + +- Then there's an array of function pointers; it's using this old-school C way of initializing arrays where you can do `int arr[] = { [0] 5, [1] 7}`. + +- And the names inside the square brackets, `SYS_fork` etc., are defined as preprocessor macros in another header file, `syscall.h`. + +- So you need to add two more entries in the array with function pointers to `sys_settickets` and `sys_getpinfo`, and then you need to define `SYS_settickets` and `SYS_getpinfo` in the relevant header file. + +- All these `sys_` wrapper functions are defined in `sysproc.c`. + +- So there, you need to create `int sys_settickets(void)` and `int sys_getpinfo(void)`. + +- The real `settickets` function will need an int argument, so you need to use `argint` there to grab it from the call stack and pass it to `settickets`; similarly, `getpinfo` will need a pointer, so you'll use `argptr`. + +- Also, there's an extra condition in the if statement for `sys_settickets`; that's because you're not allowed to use a number of tickets below 1. + +- Then there's some assembly code that needs to run for each of the system calls; luckily, it's just a pre-written macro, so you don't have to write any assembly. 
That's in `usys.S`. + +- So you just add two lines at the bottom to create macros for `SYSCALL(settickets)` and `SYSCALL(getpinfo)`. + +- Last part for the system calls: you need to declare them in a header file for user code to be able to call them. That's in `user.h`. + +- `struct pstat` will be properly defined in `pstat.h`, but you need to declare it in `user.h` as well so that user code doesn't complain when it sees it. + +- Basically, any user code that uses system calls or C (really, `xv6`) standard library functions will have to include `user.h`. + +- So far, that's everything for the two system calls as far as the OS is concerned; now we just have to actually implement them with the regular functions `settickets` and `getpinfo`, then implement the scheduler and the `ps` program. + +- `pstat.h` is not for the scheduler, but for the `ps` program, which will work somewhat like the Linux `ps`. `pstat.h` is just to define the `struct pstat`, but there's no `.c` file to go with it. + +- The scheduler will work by assigning 1 ticket by default to each process when it's created; then processes can set their own tickets using the `settickets` system call. + +- So first we need to make sure each process tracks its own tickets, then we need to assign a default of 1 ticket when creating them, then we need to write `settickets`. + +- The first part is in `proc.h`: processes are represented as a `struct proc`, so we add a new member for `int tickets`. + +- The `int ticks` member is for `ps`; I'll come back to that. + +- One other thing to note in `proc.h` is the `enum procstate`: you can see all the possible process states there. `EMBRYO` means it's in the process of creation, so what I did was `grep` for `EMBRYO` to find where the process was created in order to set the default tickets to 1. Turns out it's in `proc.c`. + +- Inside `proc.c`, there's a function `allocproc`, which initializes a process. 
+ +- There's a process table called `ptable`, and `allocproc` looks through it to find an unused process. + +- Then when it does find it, it goes to create it; I added `p->tickets = 1;` there. + +- The next change is to fit one of the requirements: child processes need to inherit the number of tickets from their parent process. + +- Child processes are created with `fork`, which is in the same file. + +- In `fork`, `curproc` is the current process, and `np` is the new process. + +- So I set `np->tickets = curproc->tickets`. + +- The scheduler needs to generate a pseudo-random number, then it should iterate through the process table with a counter initialized to 0, adding the number of tickets for each process to the counter. Once the counter is greater than the pseudo-random number, it stops and runs that process. + +- So I ended up looking in P.J. Plauger's *The Standard C Library*, which is just a big book of all the source code for the C library with commentary. It's pretty good; I don't know if it's still written that way though because the book is from the early 90s. + +- So I just implemented C's `rand` and `srand` functions. `srand` sets a random seed (not so random, as you'll see later), then `rand` turns it into a pseudo-random integer. + +- There's a bunch of type magic going on there between changes back and forth from integers to unsigned integers; that's to avoid signed integer overflow, which causes undefined behavior. Unsigned integer overflow is okay though. + +- I only made one change to make it faster, which was to write `& 32767` instead of `% 32768`. + +- So you'll see the "random" seed I used: the number of `ticks`, which I think counts the number of timer interrupts so far. + +- Which is totally not random at all, since the first time this program gets run, it'll be 0, then 1, then 2, etc. + +- So there's some lines about counting `ticks`; that was for `ps`, not the scheduler. 
+ +- The main change to make it a lottery scheduler is the counter variable. + +- And adding a for loop to count the total number of tickets that have been distributed. + +- So then at the very bottom of this file is the implementation of `settickets` and `getpinfo`. + +- So after initializing `counter` and `totaltickets`, there's a for loop that counts the total number of tickets that have gone out to processes. + +- Then we get the winning ticket. + +- Let's discuss the original source code first. So first you acquire the lock. You'll release it at the very end. But in between, you have a for loop that iterates over all the processes in `ptable`. + +- Specifically, it iterates over only the processes in `RUNNABLE` state; if a process isn't `RUNNABLE`, it just `continue`s on to the next one. (This is for the round-robin scheduling mechanism that's already in the code.) + +- So now it's gonna switch to the very first `RUNNABLE` process it finds. Like, switching to executing it. + +- So first, `c` represents the current CPU. So it sets the current CPU to run the process it found with `c->proc = p;`. + +- Then it calls this function, `switchuvm(p)`, which sets up the virtual memory address space for `p`. Then it sets the process's state to `RUNNING`. + +- And then `swtch` is where the magic happens: that one saves the scheduler's register contents to memory and loads the saved-in-memory register contents of the process `p` into the CPU. + +- So as soon as `swtch` executes, the CPU will continue executing instructions, but now they're the process's instructions. So this scheduler function just hangs there. + +- Eventually, when a timer interrupt goes off, the kernel will use another `swtch` call but with the arguments reversed to swap the scheduler's register contents from memory into the CPU's registers and save the process's register contents. At which point execution will continue at this exact point. 
+
+- So now `switchkvm` will set up the kernel's virtual memory address space.
+
+- These 5 lines are the context switch:
+
+  ```c
+  c->proc = p;                         // this CPU now runs p
+  switchuvm(p);                        // switch to p's address space
+  p->state = RUNNING;
+
+  swtch(&(c->scheduler), p->context);  // save scheduler registers, load p's
+  switchkvm();                         // back in the scheduler: kernel address space
+  ```
+
+- Then we go on to the next iteration of the inner for loop, which finds the next `RUNNABLE` process and repeats.
+
+- Only once we've executed all the `RUNNABLE` processes do we exit the inner for loop and release the lock.
+
+- The original source code is structured like this (this is pseudocode):
+
+  ```python
+  while True:
+      iterate over processes:
+          if not runnable:
+              continue
+          run it
+  ```
+
+- The new code is structured like this (this is pseudocode):
+
+  ```python
+  while True:
+      count the total tickets allotted to all processes  # one for loop here
+      get the winning ticket number
+      iterate over processes:  # another for loop here
+          if not runnable:
+              continue
+          add its tickets to counter
+          if counter <= winning ticket number:
+              continue
+          run it
+  ```
+
+- We ignore the tickets of non-`RUNNABLE` processes.
+
+- The tickets aren't numbered; each process just has a set amount of tickets, and we count up until we've passed `n` tickets, where `n` is the winner.
+
+- For example, say proc A has 5 tickets, proc B has 7, and proc C has 2. If the winning number is 3, then A would run; if it's 8, then B would run; if it's 12, then C would run.
+
+- A winner in 0-4 would be A, 5-11 would be B, and 12-13 would be C.
+
+- `settickets` is pretty basic: you just acquire a lock, set the tickets for the process, and release the lock.
+
+- `getpinfo` basically works like this:
+
+- `p` is a pointer to a `struct pstat`, as defined in `pstat.h`. Each of its members is an array, with one entry per process.
+
+- Check for a null pointer.
+
+- Iterate over the process table and set `proc_i` to the i-th process.
+
+- Set the i-th entry of each member of `p` to the value for this process.
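The `getpinfo` steps above could look roughly like the following host-side sketch. It is not the code from the patch: the field names in `struct demo_pstat`, the small `DEMO_NPROC` table size, the `-1`/`0` return values, and the `struct demo_entry` type are all assumptions chosen for illustration, and the locking that real kernel code would need around the process table is left out.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NPROC 4 /* xv6 has its own NPROC; a small value keeps this short */

/* Hypothetical process-table entry with only the fields being reported. */
struct demo_entry {
    int in_use;
    int pid;
    int tickets;
    int ticks;
};

/* One array per statistic, one slot per process; field names are assumed. */
struct demo_pstat {
    int inuse[DEMO_NPROC];
    int tickets[DEMO_NPROC];
    int pid[DEMO_NPROC];
    int ticks[DEMO_NPROC];
};

/* Copies per-process info into *p; returns 0 on success, -1 if p is null. */
int demo_getpinfo(const struct demo_entry *table, struct demo_pstat *p) {
    assert(table != NULL); /* the table itself is a precondition here */
    if (p == NULL)
        return -1;
    for (int i = 0; i < DEMO_NPROC; i++) {
        const struct demo_entry *proc_i = &table[i];
        /* Fill the i-th slot of every member from the i-th process. */
        p->inuse[i] = proc_i->in_use;
        p->tickets[i] = proc_i->tickets;
        p->pid[i] = proc_i->pid;
        p->ticks[i] = proc_i->ticks;
    }
    return 0;
}
```

A caller such as a `ps`-style program would then walk the arrays and print only the slots where `inuse[i]` is nonzero.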
+
+- One last bookkeeping piece: we need to add declarations for `struct pstat` and the `settickets` and `getpinfo` system calls in `defs.h`.
+
+- And then the last file is `ps.c`, which implements the `ps` program, similar to Linux's `ps`. It just calls `getpinfo` to fill a `struct pstat`, then prints out the info for each process in use.
+
+- And then you just modify the Makefile to include `ps.c` in the compilation, and we're done!
+
+- Oh, and this is why we needed the ticks in the scheduler: `ps` will print out how long each process has run.
+
+- So the scheduler needs to count the number of ticks for which each process actually executed.
+
+- FINALLY run `make qemu` in the `/src` directory to make sure it's all working.
diff --git a/coursepages/ostep/vm-xv6-intro.md b/coursepages/ostep/vm-xv6-intro.md
new file mode 100644
index 0000000..c86d635
--- /dev/null
+++ b/coursepages/ostep/vm-xv6-intro.md
@@ -0,0 +1,13 @@
+Credit goes to [palladian](https://github.com/palladian1)
+
+## Intro To xv6 Virtual Memory
+
+### WARNING:
+
+***The project doesn't match the currently available xv6 source code, which already has this project implemented in it!***
+
+[palladian](https://github.com/palladian1) tracked down a different xv6 source from the GitHub page of a U of Wisconsin student. We had to edit the `Makefile` so that it finds the QEMU executable correctly. We also added `null.c` to the `user` folder (and edited `makefile.mk` there); it demonstrates the lack of memory safety.
+
+Start with the code in [`start.zip`](https://github.com/spamegg1/reviews/raw/master/courses/OSTEP/ostep-projects/vm-xv6-intro/start.zip). Extract it and run `make clean` and `make qemu-nox`. Then, inside the xv6 system, run `null` to see the lack of safety! If you want to compare the results of `null` with the actual machine code, you can run `objdump -d user/null.o`.
+
+You might have to manually run `make clean` and `make qemu-nox` every time you make a change to the code.