Task types

The BOS system supports a variety of types. Tasks are usually distinguished by the content of the archives; adding the required files is enough in most cases. One of the things that is always present in tasks are input files for the tests and solutions get a score for each test that show the correctness - a decimal number between 0 and 1. Normally, solutions produce output for every input file, which is stored in a files - the contestant's output file. That output is checked against the author's output file with this command:

  • diff -Z -B -q <contestant's output files> <author's output file> - it reports whether the files differ or are the same ignoring trailing spaces at the end of the lines and blank lines.

Keep in mind that the input and output files are sanitized by the system - \r and \r\n are replaced with \n.

Normally, the possible scores for the tests are only 0 and 1. In many cases, the check is more complicated than that or there are partial results (the default behaviour is to have only scores of 0 or 1 for each test). So, there is an option to use a program (the so-called checker program) by having a checker folder:

checker

checker.cpp

some other files needed for the compilation of the checker

The files in that folder are compiled together to create a binary file for the checker. It is important that in this case these, these files can indeed be compiled (don't forget to remove binary files, .o files, and such). If you have an already compiled checker, the content of the folder has to be only a binary file checker, a shell script checker.sh, or checker.jar. The checker program is expected to produce output that is only one decimal number between 0 and 1 - the solution's score for the test. All text in the standard error is considered a message for the test and is displayed to the contestants. If the scores can be partial, than it is important to tell this to the system by adding the precision property to the grade.properties file (see subsection 3 from other properties).

We divide the tasks into three categories. The simplest, most standard and popular tasks are in the first category - batch tasks. Tasks that are non-standard and require the contestant to give an output file to a problem or many output files to the tests are in the second category - output-only tasks. The last category (interactive tasks) is the newest and requires the author to make additional programs for managing the execution process and/or for compiling the contestant's program.

Batch

These are the classic tasks. The contestant's solution is a program that is compiled by itself to produce a binary file. Then the binary file is tested with the inputs that are stored in the input files and the corresponding outputs are compared to the author's outputs for the tests. No additional setting up is needed.

Output-only

This type of tasks don't require the contestant to submit a classic program. Instead the contestant submits the outputs for the tests. There is no real compilation and usually there's a checker to validate the outputs and output the scores and messages for the tests. If there will be decimal results, than it is possible that the results are rounded and the precision property should be added to the grade.properties file (see subsection 3 from other properties).

One file

In some cases, one file is submitted with instructions or a program in an esoteric programming language (usually a simple made-up language). That file has to be a .txt file, which has to be configured by adding extensions=txt to grade.properties. Here, a checker is needed to simulate the program on the tests. Another scenario is for a task where an approximate solution has to be found. Again, the contestant has to submit a .txt file to be evaluated by a checker. For everything to work fine, an input file still has to be added (it can be just empty file).

Multiple files

This is similar to the last scenario, except that there are multiple tests. This can be achieved by adding extensions=zip to grade.properties, which enables submissions to be ZIP archives. The inputs of the tests can be written in the statement or added to the contestant folder (see Tasks). It's important that the inputs' names have extension .in. Each submission is a ZIP archive that should contain at least one output file. The names of the output files should be the same as the names of the input files, only differing in the extension - .out instead of .in.

Interactive

This is the newest category of tasks and the most diverse. Usually, contestants must not read the whole input - they have to find it with some interaction, perform some action on it, or simulate multi-step interaction with hidden information between some steps. For this reason, contestants often have to only implement some functions and not the main function. The main function is written in a jury's program - the so-called grader program. That program is included by adding a system folder containing it:

system

grader.cpp

<task>.h - needed for compilation if there is only real interaction (the contestant's program calls functions from the jury's program)

some other files needed for the compilation

The important thing is that there is only one grader.cpp file in the task archive (the name of the folder is not important). Local graders for the contestants and other needed files can be included in the contestant folder (see Tasks).

Pseudo interactive

Header file is not needed if the task is pseudo interactive. This is used to restrict the solutions to be online. In addition, input and output are done by the jury's program which removes the contestant's ability to optimize that speed. Another case where the problem needs to be pseudo interactive is when the time for the input and output has to be removed from the total execution time to make differentiating solutions easier. Sometimes even tests can be generated during the execution of the grader and this time can also be removed.

This can be set up by measuring the time for removal in the grader (using C clock or C++ chrono) and writing this information in a special file. The information should be lines in the following format:

  • io-time: <time>, where <time> is a decimal number that specifies the removed time in seconds

There could be several such lines (for example, after the input is read and after the output is written). The system only considers the last one, so you have to always output the total time for removal. Furthermore, it's important to output that information after every grader step that is measured to have actual information even if the program ends abruptly.

The measured io-time is subtracted from the total execution time and that time is displayed in the system as execution time. The time limit for the task is set according to the subtracted time, so there is another time limit property just for the subtracted time - io_time (see subsection 2 from time limits). The real time limit that is passed to the sandbox for execution is the sum of the times in the properties time and io_time. Moreover, when io_time is set up, the main function of the grader program has a first argument which is the name of the file where the measured io-time has to be written (that is the special file mentioned earlier).

Manager

The described format is not very secure because the grader program is compiled together with the code of the contestant, so the two programs share memory space. There is an option to add another program that manages the whole process and is compiled independently - a manager program. The contestant's program is compiled either by itself or together with the files in the system folder (if grader.cpp is present there). If there are no further properties, the manager program and the contestant's program are connected with 2 pipes for communication, which is known only by the manager program (for the contestant's program, the ends of the pipes are just standard input and standard output). The input of the manager program is from the input files of the tests, and the output is considered the contestant's output.

Although this method for interactive tasks is secure, the downside is the time spent for piping the information between the manager and the contestant's program. This is slower than the usual reading and writing. If that communication is done just a small number of times, then the overhead is insignificant, but it becomes very significant when it is done many times. For this communication, it is better to have a grader that communicates with the manager. That grader becomes more of a stub program - only providing better communication between the contestant's program and the manager program.

The Manager program is added in the familiar way:

manager

manager.cpp

some other files needed for the compilation

Again, the important thing is the presence of manager.cpp file. There has to be only one such file in the task archive.

We have still not described how the manager program uses the pipes. Full specification of this is in the next section.

Multiple-run

Sometimes, tasks require the contestant's program to be run several times or the contestant's program must have several independent functions (run in separate executions). This is achieved by having a manager program and adding the property processes with the appropriate value (see subsection 5 from other properties). Now we can describe the full list of arguments that is passed to the main function of the manager program, if processes=n:

in_1 out_1 in_2 out_2 ... in_n out_n pid_1 pid_2 ... pid_n

The manager program is connected with n pairs of pipes to n instances of the contestant's program (which is the binary file created from the compilation of either the grader and contestant's solution, or the contestant's solution alone). The first 2n arguments are the file descriptors for the ends of those pipes and the last n arguments are the PIDs (process IDs) of the n instances. Those PIDs give the manager the ability to better 'control' the processes, for example, it can kill some of them.

In more detail, in_1 can be used to read data from the first instance of the contestant's program (which is the standard output for that program) and out_1 can be used to write data to it. Let the main function in the manager be: int main(int argc, char* argv[]). You can use the following syntax to work with the file descriptors:

  • C way - open the files through the descriptors

    FILE *in1 = fdopen(atoi(argv[1]),"r"), *out1 = fdopen(atoi(argv[2]),"w");

    argv[1] is the corresponding argument for in_1 and argv[2] is for out_1. The function fdopen is from #include<fcntl.h> and atoi is used to convert the string value to an integer value.

  • C++ way - make streams through the descriptors (this does not impact the speed of the reading and writing through the pipes)

    __gnu_cxx::stdio_filebuf buffer(atoi(argv[1]), std::ios::in);
    istream in1(&buffer);

    The class __gnu_cxx::stdio_filebuf is from the non-standard library #include<ext/stdio_filebuf.h>!

Now we can describe the whole picture. Everything is in one sandbox which first runs a 'parent' C++ built-in program piper.cpp. That program forks n instances of the contestant's program and for each makes 2 POSIX unnamed pipes for two-way communication that become standard input and output for the contestant's program.

In addition, for each of the instances, the following signal handler is used: signal(SIGPIPE,SIG_IGN). It ensures that those instances won't die if some of the pipes are closed. For this reason, it's also good to add this in the beginning of the main function of the manager. This is from #include <signal.h> and won't compile in Windows!

Lastly, the piper program forks one instance of the manager program, passes the described arguments to it and then listens to register errors from the forked child processes.

Keep in mind, that the piper.cpp in some moment has all of the pipes ends and that are additional 4n opened file descriptors. The default limit is 64 and you can set a higher limit by the open_files property (see subsection 7 from other properties).

Limiting global memory

This is another reason for a task to be interactive. Here, we are talking about limiting the memory to very small amounts that are not possible with the memory limit. One good way to achieve this is to run the contestant's program several times on different tests and check that the program makes the same steps for the same test. The downsides of this approach are that it requires the program to be deterministic and the overhead to achieve this is significant. If there is communication through the pipes many times, it can become very slow. So it can be a challenge to make the whole process efficient enough.

Another way to achieve this is using C++ syntax means - constant expressions. The contestant is required to write constexpr functions which are intended to be evaluated at compile time. This can be done if the arguments passed to those functions can be evaluated at compile time and also if all statements and expressions within those functions are can be evaluated at compile-time. So if the functions are truly constexpr, then they cannot use non-constant expressions from the global memory, i.e. they cannot save useful information in global memory between different calls.

The problem is that these functions are not required to be evaluated at compile time. The best practice is to make a small test during compile time to check that the result of the function is a constant expression. Unfortunately, it is still possible that everything compiles fine and the function uses global non-constant expressions because the test cannot traverse all branches when there are conditional statements.

One requirement for the constexpr functions is that the compiler has to know their definitions during the early compilation - before the linking stage. This means that the contestant's program has to be included in the jury's program - the grader. Technically, here only the grader program is compiled and it must include solution.h - the contestants submit header files instead of C++ programs. This has to be configured only by adding extensions=h to grade.properties. When this is configured, only the grader program is compiled, and the contestant's source is in that folder with the name solution.h. The only difference from the normal compilation is the additional flag to the compilation command: -fconstexpr-depth=2048, which makes the maximum limit of the depth for evaluating constexpr functions more generous.

Last updated