!!!

Sunday, March 18, 2012

What this it Slack Space, Unallocated Space & Magic Number

SLACK SPACE
Slack space is a form of internal fragmentation, i.e. wasted space, on a hard disk. When a file is written to disk it’s stored at the “beginning” of the cluster. A cluster is defined as a collection of logically contiguous sectors and the smallest amount of disk space that can be allocated to hold a file. Rarely will there be an even match between the space available in a cluster (or collection of clusters for longer files) and the number of bytes in the file. Left over bytes in the cluster are unused, hence the name slack space.


Slack space or sometimes referred to as file slack is the area between the end of a file and end of the last cluster or sector used by the file in question. Area is an area that will not be used again to store the information there, so the area is "wasted" useless. Slack space is common in file systems that use a large cluster size, while the file system that uses a small cluster size can organize the storage media more effectively and efficiently. Amount of wasted disk space can be thought is estimated by multiplying the number of files (including the number of directories) with half the size of a cluster.

Slack space refers to portions of a hard drive that are not fully used by the current allocated file and which may contain data from a previously deleted file.
Illustration of slack space on a hard drive

In the example above, saving a 768 byte file (named User_File.txt) requires only sector 1 and 1/2 of sector 2 in the cluster.  Depending on the operating system, the remaining 256 bytes in sector 2 might be filled with 1′s or 0′s or might simply remain intact.  Both sectors 3 and 4 would not be overwritten and are thus considered slack space.  If the slack space previously contained data from a deleted file, this information could be recovered with forensic tools. Additional Details Operating systems allocate files on a hard drive using clusters, which are a collection of contiguous sectors.  Because a cluster is the smaller allocation unit an operating system can address, if a file does not utilize the full cluster, a portion of the space remaining may not be overwritten and might contain data from a previously deleted file. For forensic analysts, it is important to understand that slace space is considered allocated space since it is part of an allocated cluster.  As such, special tools must be used to extract and analyse slace space.  An analysis of unallocated data will not contain any slack space data.

Unallocated Space
Unallocated space, sometimes called “free space”, is logical space on a hard drive that the operating system, e.g Linux, can write to. To put it another way it is the opposite of “allocated” space, which is where the operating system has already written files to.
Examples.
If the operating system writes a file to a certain space on the hard drive that part of the drive is now “allocated”, as the file is using it the space, and no other files can be written to that section. If that file is deleted then that part of the hard drive is no longer required to be “allocated” it becomes unallocated. This means that  new files can now be re-written to that location.
On a standard, working computer, files can only be written to the unallocated space.
If a newly formatted  drive is connected to a computer, virtually all of the drive space is unallocated space (a small amount of space will be taken up by files within the file system, e.g $MFT, etc). On a new drive the unallocated space is normally zeros, as files are written to the hard drive the zeros are over written with the file dataUnallocated space, sometimes called “free space”, is logical space on a hard drive that the operating system, e.g Linux, can write to. To put it another way it is the opposite of “allocated” space, which is where the operating system has already written files to.

Magic Number
One way to incorporate such metadata, often associated with Unix and its derivatives, is just to store a "magic number" inside the file itself. Originally, this term was used for a specific set of 2-byte identifiers at the beginning of a file, but since any undecoded binary sequence can be regarded as a number, any feature of a file format which uniquely distinguishes it can be used for identification. GIF images, for instance, always begin with the ASCII representation of either GIF87a or GIF89a, depending upon the standard to which they adhere. Many file types, most especially plain-text files, are harder to spot by this method. HTML files, for example, might begin with the string <html> (which is not case sensitive), or an appropriate document type definition that starts with <!DOCTYPE, or, for XHTML, the XML identifier, which begins with <?xml. The files can also begin with HTML comments, random text, or several empty lines, but still be usable HTML.

The magic number approach offers better guarantees that the format will be identified correctly, and can often determine more precise information about the file. Since reasonably reliable "magic number" tests can be fairly complex, and each file must effectively be tested against every possibility in the magic database, this approach is relatively inefficient, especially for displaying large lists of files (in contrast, filename and metadata-based methods need check only one piece of data, and match it against a sorted index). Also, data must be read from the file itself, increasing latency as opposed to metadata stored in the directory. Where filetypes don't lend themselves to recognition in this way, the system must fall back to metadata. It is, however, the best way for a program to check if a file it has been told to process is of the correct format: while the file's name or metadata may be altered independently of its content, failing a well-designed magic number test is a pretty sure sign that the file is either corrupt or of the wrong type. On the other hand a valid magic number does not guarantee that the file is not corrupt or of a wrong type.

So-called shebang lines in script files are a special case of magic numbers. Here, the magic number is human-readable text that identifies a specific command interpreter and options to be passed to the command interpreter.

Another operating system using magic numbers is AmigaOS, where magic numbers were called "Magic Cookies" and were adopted as a standard system to recognize executables in Hunk executable file format and also to let single programs, tools and utilities deal automatically with their saved data files, or any other kind of file types when saving and loading data. This system was then enhanced with the Amiga standard Datatype recognition system. Another method was the FourCC method, originating in OSType on Macintosh, later adapted by Interchange File Format (IFF) and derivatives.



No comments:

Post a Comment