Evil Octal Information Security Team Technical Discussion Group's Archiver

EvilOctal 2005-12-2 04:45

[Repost] Becoming Productive Quickly in the UNIX World

Original link: http://hacking101.sourceforge.net/Hacking101/t1.html#AEN12

The article contains a large number of links, so reading it at the original address given above is recommended.

Table of Contents
Introduction
Matters of Mentality
The Culture of the Hacker Community
Linux Distribution
Interacting with the Shell and Basic System Administration
Where to Find Documentation
Installing Software Packages
Programming Languages
Programmer's Editor: Where is My IDE?
Standard Libraries and Operating System Services
Building Programs: First Principles
Building Programs: Enter the Autotools
Version Control
Program Maintenance
Authoring Documentation
Contributing To Open Source Projects
Introduction
While I was working on an open source project, I recruited a number of university student volunteers who were interested in contributing to the development. These volunteers were no doubt talented and enthusiastic programmers --- I knew that because some of them were A-grade students in a course I taught. However, there was a problem: most of them grew up in a Windows/Mac culture, and were new converts to the UNIX world. To them, the entire GNU/Linux development environment was totally alien: the command shell, man page reading, software building with make/autoconf/automake/libtool, not to mention emacs and cvs. I ended up spending a considerable amount of effort initiating them into the UNIX culture, and helping them become productive developers even though they had only limited exposure to program development on UNIX platforms. After a while, I began to realize that there is a gap that needs to be filled --- these young hackers need a textbook to initiate them into UNIX hackerdom. I intend this document to fill that gap.

This document gives the reader a tour of what the UNIX development environment has to offer. It describes the components usually found in the toolboxes of open source developers, and points readers to selected web resources that introduce the use of those tools. The goal is to help readers become productive quickly in the UNIX world. This document can be used by a young hacker as a syllabus for self-guided study, or by a senior developer as a resource to refer to when an apprentice asks for help.

UNIX is more a family of technologies than a single platform. It is impossible and counterproductive to introduce a young hacker to all the idiosyncrasies of the various UNIX platforms. In this document, we will focus on the GNU/Linux platform, which is usually the UNIX platform most accessible to a young hacker coming from a Windows/Mac background.

Matters of Mentality
Becoming productive quickly in the UNIX world requires a certain kind of mentality. When I talk to young hackers struggling with the GNU/Linux development environment, I realize that the things that trip them up are usually not technical. Rather, what they need is a new world view, a new understanding of what learning is all about. This section outlines some of the things that a young hacker needs to keep in mind while learning about the GNU/Linux development environment.


You don't need to know everything to be productive.

Omniscience is a very tempting proposition. It is quite understandable for a young hacker to want to understand everything before going to work, because it gives you a sense of competence. You might want to learn a tool inside out before using it. You might want to fully understand the code base of a piece of software before patching it. Resist this urge. Such indulgence will get you nowhere. Firstly, it is impossible to know all the details. Secondly, you don't need all the details. You have to learn to cope with uncertainty. This is the first attitude you need to cultivate. Good hackers are those who can be productive even when they don't have the whole picture. Learn only as much as you need to begin working. Remember that productivity, not knowledgeability, is the virtue you should be after.

Learn to learn in an unstructured manner.

We grew up used to the idea of a crystal-ball style textbook that covers everything you need to know about a subject in a systematic and articulate manner. Such a textbook is possible because the body of knowledge it covers is relatively stable, so someone can spend a reasonable amount of time synthesizing that knowledge into the comprehensive, systematic, and articulate view we have come to know as a textbook.

The bad news is that the era of the encyclopedic textbook passed a long time ago, especially in the trade of hackerdom. The reason is simple. The computer world in general, and the GNU/Linux world in particular, is changing at a rate more rapid than anything we have known before. By the time someone has done the necessary synthesis and produced a textbook, its content is already obsolete.

When you learn about the GNU/Linux development environment, you will realize that the documentation you need is scattered all over the web. No single document gives you the whole story. The burden placed on the learner is very demanding: you need to do the synthesis once performed by a textbook writer. Get used to this active style of learning. Learn to put pieces together. Do not expect one document to tell you everything. Ask a lot of questions when you read. If the document you are reading does not answer all your questions, then proactively seek alternative knowledge sources to fill in the gaps.

Stand on the shoulders of giants.

Avoid coding as much as possible. In many cases, the time spent on coding is better spent on researching whether coding can be avoided completely.

Before you write anything, ask yourself the following questions:


Is there a project out there that does what I want my brainchild to do?

Is there a library out there that can be reused to cut down the amount of coding?

Is there a tool out there that could generate part of the code for me?

If the answer to any of the above questions is yes, then do the obvious thing. That you know how to write a piece of code is not a sufficient reason for writing it. Coding is only the tip of the iceberg. There are hidden costs of maintaining a piece of code, testing it, making it usable, and documenting it. It is more productive to build your work on existing projects because behind every open source project is a whole community that does all the mentioned hard work for you.

Now, if no existing project does exactly what you want, then there are still things you can do to minimize the amount of coding:


Can you modify someone else's code to make it useful in your case? If this is possible, send a patch to the maintainer of the code.

If we are looking at only a couple of functions, and your modification is too specific to be accepted into the original project source tree, then consider stealing the code if its license permits you to do so. But do make sure you acknowledge the original author.


Read the source, Luke.

The education we receive in most college and university computer science programs focuses on writing programs. As a young hacker you need to train yourself in reading programs, especially programs written by seasoned hackers.

This, in fact, is the whole point of open source. The program source is provided so that you can read it. The best way to learn is to read how people solve problems in production software. You will find that you end up learning about things like naming conventions, coding style, programming tricks, design patterns, arcane magic, etc. Plus, there is no shame in doing so.

When you dislike the way something is implemented, or when you suspect some bugs are lurking around, your first instinct as a hacker is to open up the hood and find out what is going wrong by reading the source. This is actually how you submit your first patch. Even if you are not able to fix the problem, the insight you gain will enable you to file a more intelligent bug report or feature request.

Don't be afraid to ask questions.

Your time is better spent on becoming productive than on solving trivial problems. Many of the technical problems that you will face have been solved over and over again by thousands of people. There is nothing sexy about you being able to solve it again without consulting others. You can demonstrate your competence in other arenas, like becoming a productive member of a development team. Therefore, if you have spent more than an hour solving a technical problem, it might be time to ask someone for help.

The Culture of the Hacker Community
Although you don't need anthropology to begin hacking, it will not take long for you to recognize that hackerdom is not just a set of technologies, but a culture in itself. You will find many of the customs less strange than they look if you have some understanding of how a hacker thinks. You don't need to read everything I suggest here at once. Just make sure you browse through some of it at your leisure.

The collection of articles known as The Cathedral and the Bazaar, written by Eric Raymond and published by O'Reilly, is the definitive guide for understanding how the hacker community operates, what hackers value, and why their work habits are so productive. Make sure you read the following three articles:


The Cathedral and the Bazaar

Homesteading the Noosphere

The Magic Cauldron


Another great source of inspiration comes from the collection Open Sources: Voices from the Open Source Revolution published by O'Reilly. The content of the book is available online.

For a little bit of hero worship, you might want to check out this biography of Richard Stallman, founder of the Free Software Foundation: Free as in Freedom: Richard Stallman's Crusade for Free Software. Again, published by O'Reilly, this book is available online.

Linux Distribution
The first order of business for you is to acquire and install Linux on your personal computer. You have no shortage of choices here: distrowatch.com is tracking 60+ distributions, and a summary of what makes each one unique is provided. Pick a major brand that is easy to install. I would suggest something along the lines of Red Hat, Mandrake, or SuSE. Download the ISO images from the distribution's web site, burn the CDs, and get your first Linux installed. Consult the distribution's web site for installation instructions. For first-time Linux users, I would also suggest avoiding "nonstandard" hardware (e.g. laptops, motherboards with integrated chipsets such as the i810, etc).

Interacting with the Shell and Basic System Administration
If you come from a Windows/Mac culture, you are more used to interacting with the computer through a graphical user interface (GUI). When you develop, you work in an integrated development environment (IDE). In the UNIX world, you are offered more control over the underlying operating system. You usually interact with the OS through a command shell. In the case of Linux, your default shell is most likely bash, which is usually accessible by running one of the terminal programs like xterm, GNOME Terminal or Konsole. If you want to do any real work, you had better have a certain level of proficiency with the shell.

The Linux Zone of IBM's developerWorks web site offers a series of very well written preparation tutorials for the Linux Professional Institute's 101 certification exam. The tutorials go through the basics of common shell commands and system administration tasks:


LPI certification 101 exam prep, Part 1: Linux fundamentals

LPI certification 101 exam prep, Part 2: Basic administration

LPI certification 101 exam prep, Part 3: Intermediate administration

LPI certification 101 exam prep, Part 4: Advanced administration

Written by the developers of the Gentoo Linux distribution, these are superb tutorials for those new to the UNIX world. You will be asked to register in order to access the tutorial materials. For future reference, I would encourage you to download the PDF version of the tutorials. If you like this series, you might also want to check out other Linux-related tutorials written by the same authors.

Another good source of information is Michael Stutz's The Linux Cookbook, published by No Starch Press under the Design Science License. This book contains quick recipes for performing various tasks on Linux, most of them performed from within a shell. A casual browsing of this book will give you an idea of Linux's capabilities.

Where to Find Documentation
There are a number of places you should look to locate Linux-related documentation:


Man Pages
UNIX commands usually come with reference documentation called "manual pages", or "man pages" for short. You can browse through these man pages with the shell command "man". For example, "man ls" will give you the man page for the command "ls".

Info Pages
Some UNIX commands also come with hypertext documentation called "info pages". These info pages can be browsed by using the info command, as in "info ls". The info page browsing facilities are integrated into the emacs editor, and so you may also browse the pages from within emacs. See the programming editor section for details.

The Linux Documentation Project
Free documentation in the form of "HOWTOs" and "Guides" can be found at the Linux Documentation Project web site.

Vendor Support
If you are using a major Linux distribution, the vendor usually sets up support web sites, user forums, documentation repositories, or other forms of service that provide documentation specific to that distribution. For example, check out the following:


Red Hat Support and Docs

MandrakeExpert

MandrakeCampus

MandrakeForum

MandrakeUser

SuSE Support and Download

Installing Software Packages
If you need a development tool or a library of some sort that is not already installed by your Linux distribution, then you will need to install it yourself. Most of the time you can get by without compiling the package you need from source. Modern Linux distributions usually offer some kind of automatic software package management system, the most common of which is the Red Hat Package Manager (RPM). Consult the documentation of your distribution to figure out how you can install new software.

In most cases, using the distribution's native software package management system is the preferred method for installing software. The distribution vendor has already sorted out package dependencies for you, customized and optimized the packages for your distribution, and solved many of the problems you might encounter if you were to build the software from its source. Not repeating the vendor's work is just a corollary of the "stand on the shoulders of giants" principle.

Unfortunately, some tools you need may not be available from the distribution, or, more likely, you need to work with a specific release of a tool that is different from the one offered by the distribution. In such cases, you will have to download the source code of the software, build and install it yourself.

Again, the Linux Zone of IBM's developerWorks web site offers a very nice tutorial that introduces the basics of building and installing software from source code:


LPI certification 102 exam prep, Part 1: Compiling sources and managing packages

Unlike what is said in the tutorial, you don't need to completely go through the LPI 101 series before you begin working on this tutorial. Remember, you don't need to know everything to be productive.

As an exercise, I recommend that you try installing the latest stable releases of GNU make, autoconf, automake and libtool under the prefix $HOME/local, where $HOME is your home directory. Then customize your $PATH so that the versions you installed are used instead of those installed by your Linux distribution. Finally, uninstall everything and recover the original environment.

Programming Languages
Getting into a flame war concerning the relative superiority of various computer languages is definitely not the goal of this section. So don't sue me if I say anything here that offends your taste in programming languages. In this section we will describe the usual practice of the open source world: what languages are the preferred tongues (with respect to given tasks), and what you could do about it if you lack some of the background.

I will roughly divide programming languages into two groups, namely application programming languages and scripting languages. By application programming languages I mean compiled languages people usually use for developing large applications. They include languages like C and C++. The C programming language, despite its age and low-level nature, is the application language of choice in the open source world. Most open source infrastructure is written in C and for C. You get excellent C support from many development tools like emacs, autoconf, automake, libtool, flex, bison, etc. Your C code will be understood by almost everyone in the community, and it is much easier to get help. Competence in C also allows you to participate in the development of countless exciting open source projects. If you need to brush up on your C, you can check out the GNU C Programming Tutorial, commissioned by the Free Software Foundation. Note that it is not yet complete, but you can access its CVS snapshot. Also check out this support request for installation instructions.

The C++ programming language is the next best supported application programming language in the open source world. Tool support is everywhere. The support for Java is not as established as in the case of C and C++, but it is getting there. In this tutorial, we will assume you are using C to develop applications.

Scripting languages are usually interpreted (although many can be compiled into some form of bytecode for efficient execution), and are designed for writing small job control or text processing scripts. Examples include shell scripting, Perl, Python, Ruby, etc. The distinction between application languages and scripting languages is ever blurring.

Some level of competence in shell programming is a must for any developer. You will need it for writing configuration, build and installation scripts. A nicely written, short tutorial series for bash shell programming is the following:


Bash by example, Part 1

Bash by example, Part 2

Bash by example, Part 3


Sometimes you need to cook up a small tool to do a job, expecting to throw the tool away afterwards. The Perl scripting language is especially good for this kind of purpose. Many such scripts end up acquiring a life of their own, evolving into reusable tools with wide user bases. Examples include automake, cvs2cl, etc. If you want to be able to debug or extend these tools, you definitely need some knowledge of Perl. A very nice introduction to Perl scripting can be found in the article archive of Perl.com:


Beginner's Introduction to Perl, Part 1

Beginner's Introduction to Perl, Part 2

Beginner's Introduction to Perl, Part 3

Beginner's Introduction to Perl, Part 4

Beginner's Introduction to Perl, Part 5

Beginner's Introduction to Perl, Part 6


If contributing to legacy scripts is not your priority, and you are starting a scripting project from scratch, there are usually much better choices than Perl. The programming languages Python and Ruby have more rational language design, and should be preferred over the aging Perl. Tutorial materials can be found here:


Python for Beginners

Ruby Documents

Programmer's Editor: Where is My IDE?
As a programmer, the first tool you need is a nice programmer's editor. There are many choices out there, each having its own faithful advocates: the more popular ones, in the order of how modern they are, are nedit, Emacs, and vi (vim on Linux). Which one is the best is really a matter of personal taste. I use Emacs, and so I will give you some pointers as to how you can use Emacs productively.

Emacs has a built-in tutorial that guides you through the basics of editing. Select "Emacs Tutorial" in the "Help" menu to activate the tutorial.

Emacs is self-documenting. Most of what you want to know about Emacs can be found in the built-in info pages. Select "Read the Emacs Manual" under "Help" to browse the manual.

Make sure you check out the following features from the Emacs manual:


Major modes for editing programs

Program indentation

Compilation interface

Debugger interface (GUD)

Version control interface (VC)

Program merging interface (Emerge)

ChangeLog editing support

Program tags tables (ETAGS)

Grep searching interface

Documentation lookup

Together, these features define an IDE-like environment.

Standard Libraries and Operating System Services
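The standard C library on a Linux platform is GNU libc (glibc). It implements the ANSI/ISO C library standard, together with POSIX and other interfaces to operating system services; the standard C++ library, libstdc++, ships with GCC and is built on top of it. The online glibc manual is an indispensable reference.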

A good introductory tutorial on Linux application programming is Advanced Linux Programming, by Mark Mitchell, Jeffrey Oldham, and Alex Samuel. The book is published by New Riders under the Open Publication License, and its complete content is available online.

Building Programs: First Principles
On Linux, you use the GNU Compiler Collection (GCC) to compile, link and profile programs, the GNU Project Debugger (GDB) to debug, and the GNU Make program to automate the build process. A gentle introduction to the entire process can be found in the white paper An Introduction to C Development on Linux, from the Red Hat Developer Network.

The Linux GCC HOWTO "... covers how to set up the GNU C compiler and development libraries under Linux; gives an overview of compiling, linking, running and debugging programs under it."

To gain more understanding of how static and shared program libraries are built, consult the Program Library HOWTO, which "... discusses how to create and use program libraries on Linux. This includes static libraries, shared libraries, and dynamically loaded libraries."

The O'Reilly Network has a short tutorial series on software building with make:


Introduction to Make

Advanced Makefiles


The section An Introduction to Makefiles in the GNU Make Manual is a very nice introduction to writing makefiles. According to the manual, "if you are new to make, or are looking for a general introduction, read the first few sections of each chapter, skipping the later sections. In each chapter, the first few sections contain introductory or general information and the later sections contain specialized or technical information. The exception is section An Introduction to Makefiles, all of which is introductory." Also consult the relevant sections of the GNU Coding Standards for common Makefile conventions.

If you are involved in a "real" development project, you will very likely be using GNU autoconf, automake and libtool. In that case, you don't really need to write makefiles by hand: makefiles with all the necessary commands for compiling, linking, and installing program pieces will be generated for you automatically. However, you still need the knowledge covered in this section in order to understand how to use the "autotools".

Building Programs: Enter the Autotools


GNU Autoconf, Automake, and Libtool

Autoconf Manual

Automake Manual

Libtool Manual

Version Control
The Concurrent Versions System (CVS) is the most popular version management system in the open source world. Not only does it track the change history of a source tree, it also supports concurrent modifications to the source tree by more than one developer at the same time, making it an ideal version control system for decentralized development. Fortunately, CVS is a very well documented tool. There is no shortage of documentation available from the CVS home page. The first article-length tutorial one should check out is Jim Blandy's Introduction to CVS. It goes through the fundamentals of using CVS through examples, giving you some idea of what CVS is and how it is used.

A very comprehensive guide to the usage of CVS in open source development projects is Karl Fogel's Open Source Development with CVS. Chapters comprising a complete introduction, tutorial and reference to CVS are released under the GPL, and are thus freely available online. The proprietary chapters deal with challenges and philosophical issues inherent in running an open source project using CVS. The book can be used both as a reference and a guide to the pragmatics of CVS.

The official CVS manual, Version Management with CVS, also known after the original author as "The Cederqvist", is so well written that it can be used both as a manual and a tutorial, and is the place you should go to when you need to check out the details of a CVS feature.

Authoring Documentation
Man Page


Linux Man Page HOWTO

Darwin man page HOWTO

© 1999-2008 EvilOctal Security Team